GLM-4-9B-Chat

Architecture

Schema-generated from data/models.yaml. Every label is auditable against the model's sources.

Specs

Architecture: dense
Total params: 9B
Active params: 9B
Context window: 131K tokens (131,072 = 128 × 1,024, i.e. 128K)
Attention: gqa
Position encoding: rope
Post-training: sft, rlhf
OSI-approved: no
Data released: no
Training code: not released
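
A minimal sketch of how a spec record like the one above might be represented and sanity-checked. The `ModelSpec` class and its field names are hypothetical and are not the actual schema of data/models.yaml. The exact token count 131,072 is an assumption, chosen because it reconciles the "131K" figure here with the "128K" figure quoted elsewhere on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    # Hypothetical field names; the real data/models.yaml schema is not shown here.
    architecture: str
    total_params_b: float   # billions of parameters
    active_params_b: float  # billions of parameters used per token
    context_tokens: int
    attention: str

    def is_dense(self) -> bool:
        # In a dense model every parameter is active for every token,
        # so the total and active counts coincide.
        return self.total_params_b == self.active_params_b


# 131_072 is an assumed exact value behind the rounded "131K" label.
glm4_9b_chat = ModelSpec("dense", 9.0, 9.0, 131_072, "gqa")

context_k_binary = glm4_9b_chat.context_tokens // 1024          # units of 1,024 tokens
context_k_decimal = round(glm4_9b_chat.context_tokens / 1000)   # units of 1,000 tokens
```

Under that assumption, the same window reads as 128K in binary units and 131K in decimal units, so the two labels need not conflict.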

Benchmarks

Each score carries the date it was published; we never infer or interpolate missing scores.

General reasoning: MMLU 72.4 (as of 2024-06-05) source ↗
Code: HumanEval 71.8 (as of 2024-06-05) source ↗
Math: MATH 50.6 (as of 2024-06-05) source ↗
Held-out / arena: IFEval 69.0 (as of 2024-06-05) source ↗
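
The "dated score, no interpolation" policy can be illustrated with a small lookup. This is only a sketch: the `Score` class and `lookup` function are hypothetical names, not part of the site's code. The values are copied from the entries above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Score:
    benchmark: str
    value: float
    as_of: date  # publication date of the score


# Values copied from the benchmark entries above.
SCORES = {
    s.benchmark: s
    for s in [
        Score("MMLU", 72.4, date(2024, 6, 5)),
        Score("HumanEval", 71.8, date(2024, 6, 5)),
        Score("MATH", 50.6, date(2024, 6, 5)),
        Score("IFEval", 69.0, date(2024, 6, 5)),
    ]
}


def lookup(benchmark: str) -> Optional[Score]:
    # A benchmark with no published score returns None;
    # nothing is estimated from the other scores.
    return SCORES.get(benchmark)
```

Returning `None` rather than a default number keeps a missing score visibly missing, which matches the stated policy.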

Recommended use cases

long-context summarization
agentic tool use
Chinese-English bilingual chat

Available quantizations

GGUF: llama.cpp's container format and the common local format, with k-quants from Q2 to Q8. Runs on llama.cpp and Ollama.

Verified via the Hugging Face model tree ↗. Community quantizations change over time; the families shown are those with published weights at audit time.
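
As a rough guide to what the Q2 to Q8 range means for local memory, the sketch below estimates weight size from nominal bits per weight. It assumes exactly 9 billion parameters (the rounded figure from the spec list) and ignores file metadata, the KV cache, and runtime overhead. Real k-quant files mix precisions, so their effective bits per weight typically differ from the nominal label. The function name `approx_weight_gb` is hypothetical.

```python
PARAMS = 9e9  # rounded "Total params: 9B"; the exact count may differ


def approx_weight_gb(bits_per_weight: float, params: float = PARAMS) -> float:
    """Rough weight size in gigabytes (10**9 bytes) at a nominal bit width."""
    return params * bits_per_weight / 8 / 1e9


# Nominal widths only; 16 bits is included as an unquantized reference point.
for bits in (2, 4, 8, 16):
    print(f"{bits:>2} bits/weight: ~{approx_weight_gb(bits):.2f} GB")
```

Treat these figures as a lower bound on what a given quantization needs, since long-context use adds a KV cache on top of the weights.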

Notable innovations

· 128K context at 9B scale
· Native tool use and function calling
· 26-language coverage

Known limitations

· The GLM-4 Model License is source-available, not OSI-approved. source ↗

Lineage

Descended from the GLM-130B and ChatGLM lines at Tsinghua's KEG lab. The 1M-context sibling, GLM-4-9B-Chat-1M, extends this model's context window.
