GLM-4-9B-Chat vs Claude 3.5 Sonnet
Rows highlighted in warm gray are where the models differ. Numbers carry their as-of date and primary source.
Specs
| Field | A: GLM-4-9B-Chat | B: Claude 3.5 Sonnet |
|---|---|---|
| Released | 2024-06-05 | 2024-06-20 |
| Developer | Zhipu AI | Anthropic |
| Openness | Source-available | Proprietary |
| License | GLM-4 Model License | Proprietary |
| OSI-approved | no | no |
| Data released | no | no |
| Training code | no | no |
| Architecture | dense | unknown |
| Total params | 9B | — |
| Active params | — | — |
| Experts | — | — |
| Context window | 131K | 200K |
| Attention | GQA | unknown |
| Position enc. | RoPE | unknown |
| Pretraining tokens | — | — |
| Post-training | SFT, RLHF | RLHF, constitutional |
| Training hardware | — | — |
| $/M input | — | — |
| $/M output | — | — |
| Output tok/sec | — | — |
Benchmarks
Missing scores render as not reported; never inferred. Bold highlights the leader per benchmark.
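| Category | Benchmark | A: GLM-4-9B-Chat | B: Claude 3.5 Sonnet |
|---|---|---|---|
| General reasoning | MMLU | 72.4 (as of 2024-06-05) | — |
| Code | HumanEval | 71.8 (as of 2024-06-05) | — |
| Math | MATH | 50.6 (as of 2024-06-05) | — |
| Held-out / arena | IFEval | 69.0 (as of 2024-06-05) | — |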
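The rendering rule stated for this section (missing scores shown as not reported, never inferred; bold for the leader) can be sketched as follows. This is a minimal illustration, not the page's actual code: the function name `render_row` is hypothetical, and it assumes that higher scores are better and that no cell is bolded when either score is missing or the two scores tie.

```python
from typing import Optional


def render_row(name: str, a: Optional[float], b: Optional[float]) -> str:
    """Render one benchmark row as a markdown table line (hypothetical helper)."""

    def cell(score: Optional[float], is_leader: bool) -> str:
        # A missing score is rendered as text; it is never filled in or estimated.
        if score is None:
            return "not reported"
        text = f"{score:.1f}"
        return f"**{text}**" if is_leader else text

    # Assumption: a leader exists only when both scores are present and differ.
    both_present = a is not None and b is not None
    a_leads = both_present and a > b
    b_leads = both_present and b > a
    return f"| {name} | {cell(a, a_leads)} | {cell(b, b_leads)} |"


# MMLU row from the table above: model B has no reported score.
print(render_row("MMLU", 72.4, None))
```

Under these assumptions, a row with only one reported score carries no bold at all, which keeps the highlight from implying a comparison that was never made.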
Context · A
Zhipu AI and THUDM's GLM-4-9B-Chat brought a 128K-token context window (131,072 tokens, listed as 131K in the specs above), tool use, and multilingual coverage to a 9B-class open-weights model. A 1M-context variant shipped alongside the base 128K version. The GLM-4 Model License is source-available rather than OSI-approved.
Context · B
Claude 3.5 Sonnet was the first Claude release to beat its own larger sibling, Claude 3 Opus, on most benchmarks. It introduced Artifacts, which drove a wave of code-and-canvas product copies.