Architecture
data/models.yaml. Every label is auditable
against the model's sources.
Specs
- Architecture
- moe
- Total params
- not disclosed
- Active params
- not disclosed
- Context window
- not verified
- Attention
- Position encoding
- Pretraining tokens
- 36.0T
- Post-training
- sft, rlhf
- OSI-approved
- no
- Data released
- no
- Training code
- not released
Benchmarks
Each score carries the date it was published; we never infer or interpolate missing scores.
Available quantizations
None. The weights are not distributed, so there are no public quantizations.
Notable innovations
- · First trillion-parameter Qwen
- · 36T-token pretraining run
- · Proprietary tier of the Qwen line
Lineage
Proprietary trillion-parameter Qwen variant.
Derived from
Qwen 3 235B A22B Instruct 2025-04-28