New Open-Source Models on the Same Day: One for Reasoning, One for Programming - MiniMax and Moonshot AI Kick Off a Showdown | Synced

The article reports that two major Chinese AI companies, MiniMax and Moonshot AI, open-sourced new models on the same day. MiniMax released MiniMax-M1, its latest long-context reasoning LLM. According to MiniMax, the model supports the world's longest context window, with 1 million input tokens and 80,000 output tokens, and has the strongest agentic tool-use capability among open-source models. The article details its architecture, which combines MoE with the Lightning Attention mechanism, its new CISPO reinforcement learning algorithm, and its benchmark results on programming and long-context tasks. Moonshot AI released Kimi-Dev-72B, an open-source large model specialized for programming, which set a new SOTA among open-source models on the coding benchmark SWE-bench Verified. The article explains its technical details, including the collaboration between the BugFixer and TestWriter roles, mid-training, outcome-based reinforcement learning, and test-time self-play. It concludes by comparing the two models' preliminary performance on a practical coding test case, and provides links to their open-source repositories along with their future plans.