The article introduces Qwen3, a new generation of hybrid reasoning models in the Qwen series, released as open source. Qwen3 achieves highly competitive results on authoritative benchmarks such as GPQA, AIME24/25, and LiveCodeBench. By adopting a Mixture-of-Experts (MoE) architecture, it matches the performance of the previous generation's ultra-large-scale dense models while significantly improving efficiency and reducing compute cost.

Qwen3 integrates reasoning and non-reasoning capabilities in a single model, excelling at tasks from logical analysis to creative generation. It offers both a "thinking" mode and a "non-thinking" mode, so performance can be tuned to the scenario: thinking mode performs in-depth analysis of complex problems, while non-thinking mode prioritizes response speed in everyday conversation.

The article also provides sample code for running Qwen3 with Hugging Face transformers and ModelScope (a minimal sketch follows below), as well as deployment recipes for SGLang, vLLM, and Ollama: SGLang suits rapid deployment, vLLM suits high-throughput serving, and Ollama suits local development. It further demonstrates Qwen-Agent, which makes tool calling straightforward for users. Finally, it links to Qwen3 on Hugging Face, ModelScope, and Alibaba Cloud BaiLian.
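For the transformers usage mentioned above, here is a minimal sketch of the standard chat pattern. The model name Qwen/Qwen3-8B and the enable_thinking flag reflect the released Qwen3 checkpoints, but the exact interface should be checked against the model card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-8B"  # assumed checkpoint; any Qwen3 size works the same way

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Give me a short introduction to MoE models."}]

# enable_thinking toggles between the "thinking" and "non-thinking" modes
# described in the article; it is passed through to the chat template.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)

inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated = model.generate(**inputs, max_new_tokens=1024)

# Strip the prompt tokens and decode only the newly generated completion.
output_ids = generated[0][len(inputs.input_ids[0]):]
print(tokenizer.decode(output_ids, skip_special_tokens=True))
```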

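For the deployment options, both SGLang and vLLM expose an OpenAI-compatible HTTP API once a server is launched, so the same client code works against either backend. The launch commands in the comments and the localhost URL are illustrative; exact flags and default ports vary by version:

```python
# Launch a server first (flags vary by version; see each project's docs):
#   SGLang: python -m sglang.launch_server --model-path Qwen/Qwen3-8B
#   vLLM:   vllm serve Qwen/Qwen3-8B
#   Ollama: ollama run qwen3  (for local development, via its own CLI/API)
from openai import OpenAI

# Assumed endpoint: vLLM defaults to port 8000; SGLang typically uses 30000.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen/Qwen3-8B",
    messages=[{"role": "user", "content": "Explain mixture-of-experts briefly."}],
)
print(response.choices[0].message.content)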

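For the Qwen-Agent tool-calling demo, a rough sketch of the library's Assistant agent is shown below. The model name, server URL, and choice of the built-in code_interpreter tool are assumptions for illustration; consult the Qwen-Agent repository for the current configuration format:

```python
from qwen_agent.agents import Assistant

# Point the agent at an OpenAI-compatible endpoint, e.g. the vLLM/SGLang
# server above. Model name and URL are placeholders for this sketch.
llm_cfg = {
    "model": "Qwen3-8B",
    "model_server": "http://localhost:8000/v1",
    "api_key": "EMPTY",
}

# Register tools the agent may call; code_interpreter is a built-in tool.
bot = Assistant(llm=llm_cfg, function_list=["code_interpreter"])

messages = [{"role": "user", "content": "Use Python to compute the 20th Fibonacci number."}]

# run() streams lists of response messages; keep the final, complete list.
responses = []
for responses in bot.run(messages=messages):
    pass
print(responses[-1]["content"])
```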






