z.ai's open-source GLM-5 achieves record-low hallucination rate and leverages new 'slime' RL infrastructure
Chinese AI startup Zhipu AI, operating under the z.ai banner, has once again stunned the tech world with the unveiling of its latest large language model: GLM-5. The frontier model doesn't just push boundaries, it sets new records in performance, cost-efficiency, and enterprise readiness.
GLM-5 is the latest iteration in z.ai's impressive GLM series, and it comes with an MIT License, making it an ideal candidate for enterprise deployment. Its most notable achievement is a record-low hallucination rate: on the independent Artificial Analysis Intelligence Index v4.0, it scored a -1 on the AA-Omniscience Index, a staggering 35-point improvement over its predecessor that places GLM-5 at the top of the AI industry in knowledge reliability. It now outperforms U.S. giants like Google, OpenAI, and Anthropic by knowing when to abstain rather than fabricate information.
But GLM-5 isn’t just about reliability—it’s built for high-utility knowledge work. The model features native “Agent Mode” capabilities, allowing it to transform raw prompts or source materials directly into professional office documents. Whether it’s generating detailed financial reports, high school sponsorship proposals, or complex spreadsheets, GLM-5 delivers results in real-world formats like .docx, .pdf, and .xlsx files that integrate seamlessly into enterprise workflows.
And here’s the kicker: GLM-5 is disruptively priced at roughly $0.80 per million input tokens and $2.56 per million output tokens. That’s approximately 6x cheaper than proprietary competitors like Claude Opus 4.6, making state-of-the-art agentic engineering more cost-effective than ever before.
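At those rates, the economics are easy to check with back-of-envelope arithmetic. The sketch below uses only the per-million-token prices quoted above; the daily token volumes are a hypothetical workload chosen for illustration, not figures from z.ai.

```python
# Back-of-envelope GLM-5 cost at the quoted rates:
# $0.80 per 1M input tokens, $2.56 per 1M output tokens.
INPUT_RATE = 0.80 / 1_000_000   # dollars per input token
OUTPUT_RATE = 2.56 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a workload at the quoted GLM-5 rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Hypothetical agentic workload: 5M input tokens and 1M output tokens per day.
daily_cost = request_cost(5_000_000, 1_000_000)
print(round(daily_cost, 2))  # 6.56
```

A heavy agentic pipeline on this scale would cost under $7 a day at list price, which is the kind of arithmetic driving the "disruptively priced" framing.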
At the heart of GLM-5 is a massive leap in raw parameters. The model scales from the 355B parameters of GLM-4.5 to a staggering 744B parameters, with 40B active per token in its Mixture-of-Experts (MoE) architecture. This growth is supported by an increase in pre-training data to 28.5T tokens. To address training inefficiencies at this magnitude, z.ai developed "slime," a novel asynchronous reinforcement learning (RL) infrastructure. Traditional synchronous RL waits for every rollout in a batch to finish, so a few "long-tail" episodes bottleneck each step; slime breaks this lockstep by allowing trajectories to be generated independently, enabling the fine-grained iterations necessary for complex agentic behavior.
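The decoupling idea behind asynchronous RL can be sketched in a few lines. This is not z.ai's slime code, just a minimal toy illustrating the pattern: an actor thread streams finished trajectories into a queue, and the learner consumes them as they arrive instead of waiting for a full synchronized batch.

```python
import queue
import threading

def generate_trajectory(policy_version: int, task_id: int) -> dict:
    # Placeholder rollout: in real async RL this would be a long, variable-length
    # agentic episode. Here it just returns a dummy trajectory record.
    return {"task": task_id, "policy_version": policy_version, "reward": 1.0}

def actor(traj_queue: queue.Queue, num_tasks: int, get_version) -> None:
    # Actors generate trajectories independently: there is no batch barrier,
    # so one slow "long-tail" episode never stalls the other rollouts.
    for task_id in range(num_tasks):
        traj_queue.put(generate_trajectory(get_version(), task_id))

def learner(traj_queue: queue.Queue, num_tasks: int):
    # The learner consumes whatever trajectories are ready, possibly generated
    # under a slightly stale policy -- the core trade-off of async RL.
    version, consumed = 0, []
    while len(consumed) < num_tasks:
        consumed.append(traj_queue.get())
        version += 1  # stand-in for a gradient update bumping the policy version
    return version, consumed

traj_queue: queue.Queue = queue.Queue()
current_version = 0
t = threading.Thread(target=actor, args=(traj_queue, 8, lambda: current_version))
t.start()
final_version, trajs = learner(traj_queue, 8)
t.join()
print(final_version, len(trajs))  # prints "8 8"
```

The design choice to tolerate slightly off-policy trajectories in exchange for never idling the trainer is what makes this style of infrastructure attractive at 744B-parameter scale.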
z.ai frames GLM-5 as an "office" tool for the AGI era, built for end-to-end knowledge work. Where previous models delivered snippets, GLM-5 is designed to deliver ready-to-use documents. In practice, this means the model can decompose high-level goals into actionable subtasks and perform "Agentic Engineering," where humans define quality gates while the AI handles execution.
In terms of performance, GLM-5's benchmarks make it the most powerful open-source model in the world, according to Artificial Analysis. It surpasses Chinese rival Moonshot's new Kimi K2.5, released just two weeks ago, showing that Chinese AI companies have nearly caught up with far better-resourced proprietary Western rivals. According to z.ai's own materials, GLM-5 ranks near state-of-the-art on several key benchmarks:
- SWE-bench Verified: GLM-5 achieved a score of 77.8, outperforming Gemini 3 Pro (76.2) and approaching Claude Opus 4.6 (80.9).
- Vending Bench 2: In a simulation of running a business, GLM-5 ranked #1 among open-source models with a final balance of $4,432.12.
Beyond performance, GLM-5 is aggressively undercutting the market. Live on OpenRouter as of February 11, 2026, it is priced at approximately $0.80–$1.00 per million input tokens and $2.56–$3.20 per million output tokens. That places it in the mid-range among leading LLMs, but given its top-tier benchmark performance, it's what one might call a "steal."
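Because OpenRouter exposes an OpenAI-compatible chat-completions endpoint, trying the model requires only a standard HTTP request. The sketch below assumes a model slug of "z-ai/glm-5" (check OpenRouter's catalog for the actual identifier) and only sends the request when an OPENROUTER_API_KEY is configured.

```python
import json
import os
import urllib.request

# Sketch of calling GLM-5 via OpenRouter's OpenAI-compatible endpoint.
# The slug "z-ai/glm-5" is an assumption; verify it in OpenRouter's model list.
payload = {
    "model": "z-ai/glm-5",
    "messages": [
        {"role": "user", "content": "Draft a one-page sponsorship proposal."}
    ],
}
headers = {
    "Authorization": f"Bearer {os.environ.get('OPENROUTER_API_KEY', '')}",
    "Content-Type": "application/json",
}
body = json.dumps(payload).encode()

# Only hit the network when a key is present, so the sketch stays runnable.
if os.environ.get("OPENROUTER_API_KEY"):
    req = urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions", data=body, headers=headers
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

Swapping the model slug is the only change needed to A/B GLM-5 against proprietary rivals through the same gateway, which is what makes per-token price comparisons so direct.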
However, despite the high benchmarks and low cost, not all early users are enthusiastic about the model. Lukas Petersson, co-founder of the safety-focused autonomous AI protocol startup Andon Labs, remarked on X: “After hours of reading GLM-5 traces: an incredibly effective model, but far less situationally aware. Achieves goals via aggressive tactics but doesn’t reason about its situation or leverage experience. This is scary. This is how you get a paperclip maximizer.”
The "paperclip maximizer" refers to a thought experiment described by Oxford philosopher Nick Bostrom back in 2003. In it, an AI pursuing a seemingly benign instruction, such as maximizing the number of paperclips produced, takes the objective to such an extreme that it redirects all resources necessary for human (or other) life, causing an apocalyptic scenario or human extinction not through malice but through single-minded commitment to its goal.
Should your enterprise adopt GLM-5? Enterprises seeking to escape vendor lock-in will find GLM-5’s MIT License and open-weights availability a significant strategic advantage. Unlike closed-source competitors that keep intelligence behind proprietary walls, GLM-5 allows organizations to host their own frontier-level intelligence.
Adoption is not without friction. The sheer scale of GLM-5—744B parameters—requires a massive hardware floor that may be out of reach for smaller firms without significant cloud or on-premise GPU clusters. Security leaders must weigh the geopolitical implications of a flagship model from a China-based lab, especially in regulated industries where data residency and provenance are strictly audited.
Furthermore, the shift toward more autonomous AI agents introduces new governance risks. As models move from “chat” to “work,” they begin to operate across apps and files autonomously. Without the robust agent-specific permissions and human-in-the-loop quality gates established by enterprise data leaders, the risk of autonomous error increases exponentially.
Ultimately, GLM-5 is a "buy" for organizations that have outgrown simple copilots and are ready to build a truly autonomous office. It is for engineers who need to refactor a legacy backend or require a "self-healing" pipeline that doesn't sleep. While Western labs continue to optimize for "Thinking" and reasoning depth, z.ai is optimizing for execution and scale. Enterprises that adopt GLM-5 today are not just buying a cheaper model; they are betting on a future where the most valuable AI is the one that can finish the project without being asked twice.