The Open-Source AI Stack

Llama 3.1 Tülu 3 70B

Open weights · AI2 · 2024-11-21 · Llama 3.1 Community License

AI2's flagship demonstration that the open community can match closed instruct recipes. It is post-trained on top of Llama 3.1 70B with SFT, DPO, and a new RLVR (Reinforcement Learning with Verifiable Rewards) stage. The recipes, data, code, and infrastructure are all open, even though the weights carry the Llama 3.1 Community License inherited from the base.
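To make the idea of a verifiable reward concrete, here is a minimal sketch of the kind of check an RLVR stage might use on a math-style prompt: the policy's completion earns a fixed reward only when its final answer can be checked against a reference. The function names, the answer-extraction regex, and the reward value of 10.0 are assumptions for illustration, not AI2's implementation.

```python
import re
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the last number that appears in the completion, or None.

    Hypothetical extraction rule: real pipelines usually rely on a stricter,
    task-specific answer format.
    """
    matches = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    return matches[-1] if matches else None


def verifiable_reward(completion: str, gold_answer: str, bonus: float = 10.0) -> float:
    """Binary reward: `bonus` if the extracted answer equals the reference, else 0.

    The value of `bonus` is an illustrative assumption.
    """
    predicted = extract_final_answer(completion)
    if predicted is None:
        return 0.0
    return bonus if float(predicted) == float(gold_answer) else 0.0


if __name__ == "__main__":
    print(verifiable_reward("Adding the two parts gives 42.", "42"))
    print(verifiable_reward("I believe the answer is 41", "42"))
```

Unlike a learned reward model, the check is a deterministic function of the completion and the reference, which is what makes the reward "verifiable".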

Architecture

tokens in
Embedding: vocab not disclosed · llama3 tokenizer
× N layers
  Grouped-Query Attention: RoPE (Llama 3 scaling) · context 8,192 tokens
  Dense MLP: SwiGLU activation (standard) · 70B active params
Output projection
tokens out
Schema-generated from data/models.yaml. Every label is auditable against the model's sources.
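As a rough illustration of what Grouped-Query Attention means in the diagram above, the sketch below maps each query head to the key/value head it shares with its group. The head counts (8 query heads, 2 KV heads) are small illustrative values, not this model's configuration, and the function name is hypothetical.

```python
def kv_head_for_query_head(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """Return the index of the KV head that a given query head reads from.

    In grouped-query attention, consecutive query heads are split into
    n_kv_heads equal groups, and every head in a group shares one KV head.
    """
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    if not 0 <= q_head < n_q_heads:
        raise ValueError("q_head out of range")
    group_size = n_q_heads // n_kv_heads
    return q_head // group_size


# Illustrative sizes only: 8 query heads sharing 2 KV heads.
N_Q_HEADS, N_KV_HEADS = 8, 2
mapping = [kv_head_for_query_head(h, N_Q_HEADS, N_KV_HEADS) for h in range(N_Q_HEADS)]
kv_cache_reduction = N_Q_HEADS // N_KV_HEADS  # vs. one KV head per query head

if __name__ == "__main__":
    print(mapping)
    print(kv_cache_reduction)
```

Sharing KV heads this way shrinks the key/value cache by the ratio of query heads to KV heads, which is the main reason large dense models adopt it.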

Specs

Architecture: dense
Total params: 70B
Active params: 70B
Context window: not verified
Attention: gqa
Position encoding: rope-llama3
Post-training: sft, dpo, rlvr
OSI-approved: no
Data released: yes
Training code: released

Benchmarks

Each score carries the date it was published; we never infer or interpolate missing scores.

General reasoning

MMLU 83.1 as of 2024-11-21 source ↗

Code

HumanEval 92.4 as of 2024-11-21 source ↗

Math

MATH 63.0 as of 2024-11-21 source ↗

Held-out / arena

IFEval 83.2 as of 2024-11-21 source ↗
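The dated-score policy stated at the top of this section can be sketched as a small record type in which a missing score stays missing instead of being estimated. The class and function names are hypothetical and do not reflect the site's actual schema; the MMLU entry reuses the value listed above, and "ExampleBench" is a placeholder name.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class BenchmarkScore:
    """One benchmark entry; `score` is None when no score has been published."""
    name: str
    score: Optional[float]
    as_of: Optional[date]


def format_score(entry: BenchmarkScore) -> str:
    """Render an entry, never substituting an inferred value for a missing one."""
    if entry.score is None or entry.as_of is None:
        return f"{entry.name} not reported"
    return f"{entry.name} {entry.score} as of {entry.as_of.isoformat()}"


mmlu = BenchmarkScore("MMLU", 83.1, date(2024, 11, 21))
missing = BenchmarkScore("ExampleBench", None, None)  # placeholder entry

if __name__ == "__main__":
    print(format_score(mmlu))
    print(format_score(missing))
```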

Recommended use cases

  • open instruct baseline
  • post-training research reproducibility
  • RLVR experimentation

Available quantizations

GGUF: llama.cpp's container; the common local format, with k-quants from Q2 to Q8. Runs on llama.cpp, Ollama.
AWQ: Activation-aware 4-bit weight quantization for GPU serving. Runs on vLLM, SGLang.
EXL2: ExLlamaV2's variable-bitrate format for consumer GPUs. Runs on ExLlamaV2.
MLX: Apple MLX 4/8-bit layout for Apple silicon. Runs on Apple MLX.

Verified via the Hugging Face model tree ↗. Community quantizations change over time; the families shown are those with published weights at audit time.
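For a rough sense of why quantization matters at this size, the back-of-the-envelope sketch below estimates weight memory for 70B parameters at a few bit widths. It counts weights only and ignores quantization scales, the KV cache, and runtime overhead, so real footprints will be somewhat larger; the function name is hypothetical.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 70e9  # total parameter count from the Specs section

if __name__ == "__main__":
    for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
        print(f"{label}: ~{weight_memory_gb(N_PARAMS, bits):.0f} GB")
```

At 4 bits per weight the estimate comes to about 35 GB, versus about 140 GB at 16 bits.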

Notable innovations

  • RLVR (Reinforcement Learning with Verifiable Rewards)
  • Fully open post-training recipe
  • Open-recipe instruct model that matches Llama 3.1 70B Instruct, whose post-training recipe is closed

Known limitations

  • Weights are released under the Llama 3.1 Community License, not Apache; the Tülu 3 code and data are Apache, but the checkpoint itself inherits Meta's license. source ↗

Lineage

Post-trained from Llama 3.1 70B base; fully open recipe (SFT + DPO + RLVR). Weights inherit the Llama 3.1 Community License from the base.

Sources