Llama 3.1 Tülu 3 70B

Architecture

Schema-generated from data/models.yaml. Every label is auditable against the model's sources.

Specs

Architecture: dense
Total params: 70B
Active params: 70B
Context window: not verified
Attention: gqa
Position encoding: rope-llama3
Post-training: sft, dpo, rlvr
OSI-approved: no
Data released: yes
Training code: released

Benchmarks

Each score carries the date it was published; we never infer or interpolate missing scores.

General reasoning

MMLU: 83.1 (as of 2024-11-21, source ↗)

Code

HumanEval: 92.4 (as of 2024-11-21, source ↗)

Math

MATH: 63.0 (as of 2024-11-21, source ↗)

Held-out / arena

IFEval: 83.2 (as of 2024-11-21, source ↗)
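The policy stated at the top of this section (every score is dated, and missing scores are never inferred) can be sketched as a small record type. This is a minimal illustration, not the site's actual schema: the `Score` class, the `SCORES` list, and the `lookup` helper are hypothetical names introduced here, and only the four values shown on this card are included.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Score:
    """One published benchmark result (hypothetical record layout)."""
    benchmark: str
    value: float
    published: date  # the date the score was published


# The four scores listed on this card, each with its publication date.
SCORES = [
    Score("MMLU", 83.1, date(2024, 11, 21)),
    Score("HumanEval", 92.4, date(2024, 11, 21)),
    Score("MATH", 63.0, date(2024, 11, 21)),
    Score("IFEval", 83.2, date(2024, 11, 21)),
]


def lookup(benchmark: str) -> Optional[Score]:
    """Return the recorded score, or None when nothing was published.

    A missing score stays missing: no default, no estimate, no interpolation.
    """
    for score in SCORES:
        if score.benchmark == benchmark:
            return score
    return None
```

Calling `lookup("MMLU")` returns the dated record, while a benchmark that is not listed on this card returns `None` rather than an estimated value.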

Recommended use cases

open instruct baseline
post-training research reproducibility
RLVR experimentation

Available quantizations

GGUF: llama.cpp's container format and the common choice for local inference, with k-quants from Q2 to Q8. Runs on llama.cpp and Ollama.

AWQ: activation-aware 4-bit weight quantization for GPU serving. Runs on vLLM and SGLang.

EXL2: ExLlamaV2's variable-bitrate format for consumer GPUs. Runs on ExLlamaV2.

MLX: Apple MLX 4-bit and 8-bit layouts for Apple silicon. Runs on Apple MLX.

Verified via the Hugging Face model tree ↗. Community quantizations change over time; the families shown are those with published weights at audit time.
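To get a rough sense of what these bit widths mean for a 70B dense model, the weight storage can be estimated as parameters times bits per weight, divided by eight. The sketch below is back-of-the-envelope arithmetic only: the `weight_gib` helper is a hypothetical name, and the estimate assumes a uniform bit width and ignores format overhead, quantization scales, KV cache, and activations, so real files and real memory use will differ.

```python
PARAMS = 70e9  # total parameters, from the Specs section


def weight_gib(bits_per_weight: float, params: float = PARAMS) -> float:
    """Approximate weight storage in GiB for a uniform bit width.

    Assumption: every weight uses exactly `bits_per_weight` bits and there
    is no per-block metadata, which real quantization formats do add.
    """
    total_bytes = params * bits_per_weight / 8
    return total_bytes / 2**30


# Compare an unquantized 16-bit checkpoint with 8-bit and 4-bit widths.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_gib(bits):.0f} GiB")
```

The estimate scales linearly with bit width, so halving the bits per weight halves the approximate weight footprint.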

Notable innovations

· RLVR (Reinforcement Learning with Verifiable Rewards)
· Fully open post-training recipe
· Open-recipe instruct model positioned as a match for Llama 3.1 70B Instruct, whose post-training recipe is closed
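The core idea behind RLVR, as the name suggests, is that the reward comes from a programmatic check of the answer rather than from a learned reward model. The sketch below illustrates that idea for numeric answers only. It is not the Tülu 3 implementation: the function names, the "last number in the completion" extraction rule, and the 1.0/0.0 reward values are all assumptions made for illustration.

```python
import re
from typing import Optional


def extract_final_number(completion: str) -> Optional[str]:
    """Return the last number that appears in the completion, or None.

    Assumption: the final answer is the last integer or decimal in the text.
    """
    matches = re.findall(r"-?\d+(?:\.\d+)?", completion)
    return matches[-1] if matches else None


def verifiable_reward(completion: str, reference: str) -> float:
    """Binary reward: 1.0 if the extracted answer equals the reference.

    Assumption: `reference` is a numeric string such as "42" or "3.5".
    """
    answer = extract_final_number(completion)
    if answer is None:
        return 0.0  # nothing to verify, so no reward
    return 1.0 if float(answer) == float(reference) else 0.0
```

For example, `verifiable_reward("So the total is 42.", "42")` yields the positive reward, while a completion ending in a different number, or containing no number at all, yields zero. A training loop would feed this scalar to the policy-gradient update in place of a reward-model score.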

Known limitations

· Weights are released under the Llama 3.1 Community License, not Apache; the Tülu 3 code and data are Apache but the checkpoint itself inherits Meta's license. source ↗

Lineage

Post-trained from Llama 3.1 70B base; fully open recipe (SFT + DPO + RLVR). Weights inherit the Llama 3.1 Community License from the base.
