Moonshot

Kimi K2 Instruct

Kimi K2 is a state-of-the-art mixture-of-experts (MoE) language model with 32 billion active parameters and 1 trillion total parameters. Trained with the Muon optimizer, it delivers exceptional performance across frontier knowledge, reasoning, and coding tasks, and is meticulously optimized for agentic capabilities: tool use, reasoning, and autonomous problem-solving.
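Agentic tool use of the kind described above is typically driven through an OpenAI-compatible chat-completions request that declares the tools the model may call. The sketch below builds such a request payload; the model id, tool name, and schema are illustrative assumptions, not confirmed API details.

```python
import json

def build_tool_call_request(user_message: str) -> dict:
    """Sketch of an OpenAI-compatible chat request with one tool declared.
    The model id ("kimi-k2-instruct") and the get_weather tool are
    hypothetical examples for illustration only."""
    return {
        "model": "kimi-k2-instruct",  # assumed model id
        "messages": [
            {"role": "user", "content": user_message},
        ],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical tool
                    "description": "Look up current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
        "tool_choice": "auto",  # let the model decide whether to call the tool
    }

payload = build_tool_call_request("What's the weather in Berlin?")
print(json.dumps(payload, indent=2))
```

With `"tool_choice": "auto"`, the model is free to answer directly or to emit a structured tool call that the calling application then executes and feeds back.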

Kimi K2.5

Kimi K2.5 is the latest flagship iteration of Moonshot AI's large language model series, marking a significant leap in multimodal and agentic capabilities. It features a native multimodal architecture that accepts both visual and text inputs, alongside selectable thinking and non-thinking modes. The model retains the 256k-token context window of the K2 series and achieves new open-source state-of-the-art (SoTA) performance across general-intelligence, coding, and visual-understanding benchmarks. Kimi K2.5 delivers a breakthrough in frontend development, generating fully functional, aesthetically polished interactive interfaces with complex dynamic layouts directly from natural language. Optimized for complex problem-solving, it excels at multi-step tool invocation, logical reasoning, and full-stack code synthesis.

Kimi K2 0905

Kimi K2 0905 is the September update to Kimi K2 0711. It is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring a total of 1 trillion parameters, with 32 billion active during each forward pass. It supports long-context inference of up to 256k tokens, an increase from the previous 128k. This update improves agentic coding with higher accuracy and better generalization across scaffolds, and enhances frontend coding with more aesthetically pleasing and functional outputs for web, 3D, and related tasks. Kimi K2 is optimized for agentic capabilities, including advanced tool use, reasoning, and code synthesis. It excels across coding (LiveCodeBench, SWE-bench), reasoning (ZebraLogic, GPQA), and tool-use (Tau2, AceBench) benchmarks. The model is trained using a novel stack that incorporates the MuonClip optimizer for stable, large-scale MoE training.
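The MoE figures above (1 trillion total parameters, 32 billion active, i.e. roughly 3% of the network per forward pass) rely on a gating function that routes each token to a few experts. The following is a generic top-k gating sketch to illustrate the mechanism; it is not Kimi K2's actual routing implementation, which is not described here.

```python
import math

def top_k_gate(logits, k=2):
    """Generic top-k MoE gating sketch: softmax over expert scores,
    keep the k highest-scoring experts, renormalize their weights.
    Illustrative of MoE routing in general, not of Kimi K2's design."""
    # Numerically stable softmax over all expert logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Select the top-k experts and renormalize so weights sum to 1.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}

# Example: 8 experts, only 2 activated for this token.
weights = top_k_gate([0.1, 2.0, -1.0, 0.5, 3.0, 0.0, 1.5, -0.5], k=2)
print(weights)
```

Because only the selected experts' weights are loaded and multiplied per token, compute scales with active parameters (here 2 of 8 experts) rather than total parameters, which is what lets a 1T-parameter model run with 32B-parameter cost.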
