Moonshot AI Drops Kimi K2.7-Code: Smarter Software Engineering That Won't Break the Bank

Moonshot AI has launched Kimi K2.7-Code, an open-weight coding model designed to maximize token efficiency. By cutting thinking-token usage by roughly 30% while matching Kimi K2.6 pricing, it offers developers an affordable alternative to heavyweights like GLM 5.1 and DeepSeek V4 Pro.

Erick Johnson

12 Jun 2026 • 3 min read

Moonshot AI just launched Kimi K2.7-Code on Hugging Face, pitching it as a highly streamlined, open-weight option for developers who want agentic coding without the usual resource drain. If you have been keeping tabs on the rapid-fire releases coming out of the open-weight space this year, you know the competition is fierce.

Rather than building a new frontier architecture from the ground up, Moonshot built Kimi K2.7-Code as a specialized offshoot of its existing Kimi K2.6 model. It uses a Mixture-of-Experts setup with 1 trillion total parameters, though only 32 billion are activated per token. The core mission here is to take Kimi's strong long-context foundation and optimize it specifically for end-to-end software engineering workflows.
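To put the Mixture-of-Experts figures in perspective, here is a quick back-of-the-envelope sketch. It uses only the two parameter counts quoted above; the variable names are ours, and the calculation says nothing about how Moonshot routes tokens to experts.

```python
# Parameter counts as quoted in the article.
TOTAL_PARAMS = 1_000_000_000_000   # 1 trillion total parameters
ACTIVE_PARAMS = 32_000_000_000     # 32 billion activated per token

# Share of the model's weights that take part in processing any one token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"Active per token: {active_fraction:.1%}")  # prints "Active per token: 3.2%"
```

In other words, each token touches only about one thirtieth of the model's weights.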

Where Kimi K2.7-Code Cuts the Fat

The biggest upgrade over Kimi K2.6 is pure token efficiency. Moonshot claims the new model slashes thinking-token usage by roughly 30 percent while actually improving performance on complex, long-horizon programming tasks.

If you have used Kimi K2.6, you know it can sometimes get bogged down in its own reasoning loop, running up API costs and lengthening inference times. By tightening its reasoning traces, Kimi K2.7-Code reaches the same or better solutions with fewer tool calls. Early community testing on forums suggests the model is also less prone to commenting out failing tests instead of fixing them, a frequent complaint lodged against the older Kimi K2.5 and K2.6 iterations.

The Price is Right

For developers running high-throughput pipelines, the economic angle is probably the biggest selling point. Moonshot kept the API pricing identical to Kimi K2.6. Combine that flat pricing with the roughly 30 percent reduction in thinking tokens and the real-world cost to build or debug features drops, with the exact saving depending on how much of each request is thinking tokens rather than input or final output. Moonshot is positioning the model as a budget-saver compared to closed-source options.
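How much a 30 percent cut in thinking tokens saves on a given request depends on the token mix. The sketch below works through one example. Every number in it is hypothetical: the per-million-token prices and the token counts are placeholders, not Moonshot's published rates, and it assumes thinking tokens are billed at the output rate.

```python
def request_cost(input_tokens, thinking_tokens, answer_tokens,
                 price_in_per_m, price_out_per_m):
    """Dollar cost of one request.

    Assumption: thinking tokens are billed at the output-token rate.
    """
    output_tokens = thinking_tokens + answer_tokens
    return (input_tokens * price_in_per_m
            + output_tokens * price_out_per_m) / 1_000_000


# Hypothetical prices in dollars per million tokens (not real rates).
PRICE_IN = 1.00
PRICE_OUT = 4.00

# Hypothetical token counts for a single agentic coding request.
INPUT_TOKENS = 20_000
ANSWER_TOKENS = 2_000
BASELINE_THINKING = 10_000

# Apply the claimed 30 percent reduction using integer arithmetic.
REDUCED_THINKING = BASELINE_THINKING * (100 - 30) // 100  # 7,000 tokens

baseline = request_cost(INPUT_TOKENS, BASELINE_THINKING, ANSWER_TOKENS,
                        PRICE_IN, PRICE_OUT)
reduced = request_cost(INPUT_TOKENS, REDUCED_THINKING, ANSWER_TOKENS,
                       PRICE_IN, PRICE_OUT)
savings = 1 - reduced / baseline

print(f"Baseline: ${baseline:.3f}")   # Baseline: $0.068
print(f"Reduced:  ${reduced:.3f}")    # Reduced:  $0.056
print(f"Savings:  {savings:.1%}")     # Savings:  17.6%
```

With this particular mix, the bill falls by about 18 percent rather than the full 30, because input and final-answer tokens cost the same as before. Workloads dominated by long reasoning traces would land closer to 30 percent.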

Stacking Up: Kimi vs GLM 5.1 and DeepSeek V4 Pro

How does this new offshoot hold up against the two major open-weight titans? It depends heavily on your specific deployment constraints and what you value most in a coding model.

GLM 5.1: The Premium Accuracy Standard

Z.ai’s GLM 5.1 currently commands respect on leaderboards for its sheer code quality. Developers note that GLM 5.1 is highly reliable at catching deep structural logic flaws that even top-tier closed models occasionally miss. However, running GLM 5.1 locally is a serious headache. It demands at least four H200 GPUs just to run in a quantized INT4 state, and eight H200s for unquantized FP8 execution. Kimi K2.7-Code cannot quite match GLM 5.1's code quality, but it handles long context with significantly lower hardware demands.

DeepSeek V4 Pro: The Context Monster

DeepSeek V4 Pro is a beast built for massive context windows and heavy agentic orchestration. It is very cheap for its size, but users frequently complain about its extreme verbosity. It can spend paragraphs over-analyzing a user's prompt, and when the context stretches past 20,000 tokens, some developers report that it begins forgetting earlier details. DeepSeek also has a habit of cluttering codebases with try-catch blocks that silently swallow errors, which makes failures tough to debug.

Kimi K2.7-Code sits comfortably in the middle. With its 256K context window, it avoids DeepSeek’s forgetfulness while maintaining a tighter, cleaner output format.

The Editorial Verdict

Some developers on Reddit have pointed out that Moonshot evaluated Kimi K2.7-Code using an unusual, proprietary benchmark suite rather than industry standards like SWE-bench Pro. While that means we should take official performance charts with a grain of salt, the practical improvements in token efficiency are tough to ignore.

Kimi K2.7-Code isn't trying to dethrone the frontier models. Instead, it is a highly targeted, pragmatic update to Kimi K2.6 that makes autonomous software agents much cheaper to run in production.