OpenAI Unveils o1 Large Model: Reinforcement Learning Pushes LLM Reasoning to New Heights

OpenAI released the o1 large model on September 13, 2024, reporting a significant leap in complex reasoning achieved through reinforcement learning. The model performed strongly on benchmarks in physics, chemistry, biology, mathematics, and programming. Notably, it correctly solved 83% of the problems on a qualifying exam for the International Mathematical Olympiad, compared with 13% for GPT-4o. It also outperformed GPT-4o in programming competitions and surpassed human experts on certain benchmarks. Alongside o1, OpenAI introduced o1-mini, a faster and cheaper version designed specifically for programming. This article covers the o1 model's working principles, evaluation results, and future development directions.