Writing · 17.06.26 · 2 min

GLM-5.2: the most powerful text-only open weights LLM?

Editorial translation of Simon Willison's post on benchmarks, cost, and hands-on tests.

Source: GLM-5.2 is probably the most powerful text-only open weights LLM · Simon Willison · June 17, 2026

Z.ai released GLM-5.2 to coding plan subscribers on June 13, then published the open weights under the MIT license on June 16. It is a 753B-parameter mixture-of-experts (MoE) model with 40B active parameters. Input is text only; vision is a separate closed family. The context window is 1 million tokens, up from 200K on GLM-5.1.

Benchmarks and cost

Artificial Analysis ranks GLM-5.2 #1 among open weights models on Intelligence Index v4.1 (score 51), ahead of MiniMax-M3, DeepSeek V4 Pro, and Kimi K2.6.

Caveat: the model is token-hungry. It uses ~43K output tokens per Intelligence Index task vs 26K for GLM-5.1, roughly 1.65x as many, so output pricing matters in production budgets.

It also ranks #2 on Code Arena WebDev, behind only Claude Fable 5. That is surprising for frontend agentic coding, given that the model has no image input.

On OpenRouter, most providers charge $1.40 per million input tokens and $4.40 per million output tokens, vs $5/$30 for GPT-5.5 and $5/$25 for Claude Opus.
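To make the pricing and the token-hunger caveat concrete, here is a back-of-envelope sketch in Python. It uses only the figures quoted in this section ($4.40/M output for GLM-5.2, $30/M output for GPT-5.5, ~43K and 26K output tokens per task). The function names are illustrative, input tokens are ignored, and applying the GLM-5.2 rate to the GLM-5.1 token count is an assumption made purely for comparison.

```python
def output_cost_usd(output_tokens: int, price_per_million: float) -> float:
    """Cost in USD of generating `output_tokens` at a per-million-token price."""
    return output_tokens * price_per_million / 1_000_000


def equivalent_tokens(output_tokens: int, price: float, other_price: float) -> float:
    """Output tokens at `other_price` that cost the same as `output_tokens` at `price`."""
    return output_tokens * price / other_price


GLM_OUT = 4.40     # $/M output tokens, OpenRouter list price quoted in the text
GPT55_OUT = 30.00  # $/M output tokens quoted in the text

glm52 = output_cost_usd(43_000, GLM_OUT)  # ~43K output tokens per task
glm51 = output_cost_usd(26_000, GLM_OUT)  # assumption: same rate applied to 26K tokens

print(f"Output cost per task at 43K tokens: ${glm52:.4f}")
print(f"Output cost per task at 26K tokens: ${glm51:.4f}")
print(f"Token ratio: {43_000 / 26_000:.2f}x")
print(f"Break-even at $30/M output: {equivalent_tokens(43_000, GLM_OUT, GPT55_OUT):,.0f} tokens")
```

Under those assumptions, a 43K-token task costs about $0.19 in output, which is what roughly 6.3K output tokens would cost at $30/M. The list price stays well ahead per token, but the gap narrows for any competing model that finishes the same task in far fewer tokens.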

Hands-on (Simon’s tests)

GLM-5.1 produced a much-loved animated pelican SVG. GLM-5.2 improved on that result for the pelican prompt, with the animations intact.

The “NORTH VIRGINIA OPOSSUM ON AN E-SCOOTER” prompt regressed vs GLM-5.1’s animated HTML+CSS output: no animation and lower quality.

Lesson: leaderboard rank doesn’t carry over uniformly across task types.

For backend / LLM integration

Open weights, long context, and cheap list pricing make it worth trying, self-hosted or via OpenRouter, for RAG and agents. But ~43K output tokens per task inflates production spend, so enforce max output, rate limits, and eval gates.
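One way to enforce an output limit is a small budget guard that sits in front of whatever client you use. The sketch below is a minimal illustration, not a production design. The class name, the per-call cap, and the daily cap are all hypothetical. It makes no API calls and only computes the output limit you would pass to your provider's max-output parameter.

```python
from dataclasses import dataclass


@dataclass
class OutputBudget:
    """Hypothetical guard: caps output tokens per call and per day."""
    max_tokens_per_call: int
    daily_token_cap: int
    used_today: int = 0

    def allowance(self, requested: int) -> int:
        """Return the output limit to send with the next call (0 means refuse)."""
        remaining = self.daily_token_cap - self.used_today
        return max(0, min(requested, self.max_tokens_per_call, remaining))

    def record(self, output_tokens: int) -> None:
        """Record tokens actually generated, as reported by the provider."""
        self.used_today += output_tokens


# Illustrative numbers only: 8K per call, 20K per day.
budget = OutputBudget(max_tokens_per_call=8_000, daily_token_cap=20_000)

for _ in range(4):
    limit = budget.allowance(requested=43_000)
    if limit == 0:
        print("daily cap reached, skipping call")
        break
    print(f"calling model with output limit {limit}")
    budget.record(limit)  # worst case: the model uses the whole limit
```

A hard cap like this truncates long answers, so it pairs naturally with an eval gate that checks whether capped responses still pass your task-level tests before the cap is lowered further.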

Charity Majors’ recent point applies: code got cheaper to generate; engineering discipline matters more.

Model cards and downloads live on Z.ai and Artificial Analysis; Willison’s post is the practitioner filter.

GLM-5.2: the most powerful text-only open weights LLM? — Aziz Osmanoğlu