The Open-Source AI Stack

GLM-4-9B-Chat

Source-available · Zhipu AI · 2024-06-05 · GLM-4 Model License

Zhipu AI and THUDM's GLM-4-9B-Chat brought 128K context, tool use, and multilingual coverage to a 9B-class open-weights model. A 1M-context variant shipped alongside the base 128K version. The GLM-4 Model License is source-available rather than OSI-approved.

Architecture

[Architecture diagram] tokens in → Embedding (vocab not disclosed · glm-4 tokenizer) → × N layers [Grouped-Query Attention (RoPE · context 131,072 tokens) → Dense MLP (SwiGLU activation (standard) · 9B active params)] → Output projection → tokens out
Schema-generated from data/models.yaml. Every label is auditable against the model's sources.

Specs

Architecture: dense
Total params: 9B
Active params: 9B
Context window: 131K tokens
Attention: gqa
Position encoding: rope
Post-training: sft, rlhf
OSI-approved: no
Data released: no
Training code: not released
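
The page quotes the context window three ways (128K, 131K, and 131,072 tokens). A minimal sketch of the arithmetic, assuming "128K" uses the binary convention (1K = 1,024) and "131K" is the same count rounded to decimal thousands:

```python
# "128K" read with binary K (1024) gives the exact token count in the diagram.
context_tokens = 128 * 1024
print(context_tokens)                 # 131072

# The Specs entry "131K" is that same count rounded to decimal thousands.
print(round(context_tokens / 1000))   # 131
```

All three figures therefore describe the same window.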

Benchmarks

Each score carries the date it was published; we never infer or interpolate missing scores.

General reasoning

MMLU 72.4 as of 2024-06-05 source ↗

Code

HumanEval 71.8 as of 2024-06-05 source ↗

Math

MATH 50.6 as of 2024-06-05 source ↗

Held-out / arena

IFEval 69.0 as of 2024-06-05 source ↗
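
The "never infer or interpolate" rule above can be illustrated with a small record type. This is only a sketch: the `Score` class and `lookup` helper are hypothetical names, not the site's actual schema, and the values are the four scores listed on this page.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Score:
    """One published benchmark result with its publication date (hypothetical schema)."""
    benchmark: str
    value: float
    as_of: date


# Scores exactly as listed on this page; nothing is estimated.
SCORES = {
    s.benchmark: s
    for s in (
        Score("MMLU", 72.4, date(2024, 6, 5)),
        Score("HumanEval", 71.8, date(2024, 6, 5)),
        Score("MATH", 50.6, date(2024, 6, 5)),
        Score("IFEval", 69.0, date(2024, 6, 5)),
    )
}


def lookup(benchmark: str) -> Optional[Score]:
    """Return the published score, or None when no score is listed."""
    return SCORES.get(benchmark)


print(lookup("MMLU"))
print(lookup("GSM8K"))  # not listed on this page, so the result is None
```

Returning `None` for an unlisted benchmark, rather than a default or an average, keeps missing data visibly missing.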

Recommended use cases

  • long-context summarization
  • agentic tool use
  • Chinese-English bilingual chat

Available quantizations

GGUF: llama.cpp's container format; the common local format, with k-quants from Q2 to Q8. Runs on llama.cpp and Ollama.

Verified via the Hugging Face model tree ↗. Community quantizations change over time; the families shown are those with published weights at audit time.
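
To get a rough sense of what the quantization levels mean for a 9B-parameter model, the sketch below estimates weight storage from a nominal bit width. It assumes exactly 9e9 parameters (the rounded figure from Specs) and ignores per-block scales, file metadata, and the KV cache, so real GGUF files are typically somewhat larger than these numbers. The `weight_gib` helper is a hypothetical name.

```python
PARAMS = 9e9  # rounded parameter count from the Specs section (assumption)


def weight_gib(bits_per_weight: float, params: float = PARAMS) -> float:
    """Approximate weight storage in GiB for a nominal bit width.

    Ignores quantization scales, metadata, and runtime memory such as the KV cache.
    """
    return params * bits_per_weight / 8 / 2**30


for bits in (16, 8, 4, 2):
    print(f"{bits:>2} bits/weight ~ {weight_gib(bits):.1f} GiB")
```

Under these assumptions, 16-bit weights need roughly 17 GiB, while a nominal 4-bit quantization needs a little over 4 GiB.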

Notable innovations

  • 128K context at 9B scale
  • Native tool use and function calling
  • 26-language coverage
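
As a rough illustration of what function calling involves on the application side, the sketch below dispatches a model-emitted tool call to a Python function. The JSON shape (`name` plus `arguments`), the `get_weather` tool, and the `dispatch` helper are all assumptions made for illustration; this page does not specify the exact format GLM-4-9B-Chat emits.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would query a weather service."""
    return f"Sunny in {city}"


# Registry of callable tools, keyed by the name the model is told about.
TOOLS = {"get_weather": get_weather}


def dispatch(tool_call_json: str) -> str:
    """Parse a tool call (assumed JSON shape) and run the matching function."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))


result = dispatch('{"name": "get_weather", "arguments": {"city": "Beijing"}}')
print(result)
```

In an agentic loop, the returned string would be fed back to the model as the tool's observation before it produces its next message.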

Known limitations

  • GLM-4 Model License is source-available, not OSI-approved. source ↗

Lineage

Descended from the GLM-130B and ChatGLM lines at Tsinghua's KEG lab. The sibling GLM-4-9B-Chat-1M extends the context window to 1M tokens.

Sources