The Open-Source AI Stack

Qwen3-VL 235B-A22B Instruct vs Qwen3 Max

Rows highlighted in warm gray are where the models differ. Numbers carry their as-of date and primary source.

Specs

Field              | A: Qwen3-VL 235B-A22B Instruct | B: Qwen3 Max
Released           | 2025-09-23                     |
Developer          | Alibaba                        | Alibaba
Openness           | Open                           | Proprietary
License            | Apache-2.0                     | Proprietary
OSI-approved       | yes                            | no
Data released      | no                             | no
Training code      | no                             | no
Architecture       | moe                            | moe
Total params       | 235B                           |
Active params      | 22B                            |
Experts            |                                |
Context window     | 262K                           |
Attention          | mrope-interleaved              |
Position enc.      | rope-interleaved               |
Pretraining tokens |                                |
Post-training      | sft, rlhf                      | sft, rlhf
Training hardware  |                                |
$/M input          | $0.30                          | $1.66
$/M output         | $1.90                          | $7.22
Output tok/sec     | 50.9                           | 32.4
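
The per-million-token prices and output speeds above are easier to compare when applied to a concrete request. The sketch below is illustrative only: the workload of 10,000 input tokens and 1,000 output tokens is a hypothetical example, not a figure from this page, and the time estimate ignores time to first token, batching, and network latency.

```python
# USD per million tokens and output tokens per second, copied from the Specs
# rows above (first value = model A, second value = model B).
SPECS = {
    "Qwen3-VL 235B-A22B Instruct": {"input": 0.30, "output": 1.90, "tok_per_sec": 50.9},
    "Qwen3 Max": {"input": 1.66, "output": 7.22, "tok_per_sec": 32.4},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-million-token rates."""
    spec = SPECS[model]
    total = input_tokens * spec["input"] + output_tokens * spec["output"]
    return total / 1_000_000


def generation_seconds(model: str, output_tokens: int) -> float:
    """Return a rough decode time: output tokens divided by listed tokens/sec."""
    return output_tokens / SPECS[model]["tok_per_sec"]


if __name__ == "__main__":
    # Hypothetical workload, chosen only for illustration.
    for name in SPECS:
        cost = request_cost(name, input_tokens=10_000, output_tokens=1_000)
        secs = generation_seconds(name, output_tokens=1_000)
        print(f"{name}: ${cost:.4f} per request, ~{secs:.1f} s to generate")
```

Under this assumed workload, Qwen3 Max costs roughly 4.9 times as much per request as Qwen3-VL 235B-A22B Instruct and takes longer to emit the same number of output tokens.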

Benchmarks

Missing scores render as not reported; never inferred. Bold highlights the leader per benchmark.

General reasoning

MMLU-Pro     | 82.3 (2026-05-21) | 84.1 (2026-05-21)
GPQA-Diamond | 71.2 (2026-05-21) | 76.4 (2026-05-21)

Code

LiveCodeBench 59.4 2026-05-21

Math

AIME 2025 70.7 2026-05-21

Context · A

Vision-language flagship of the Qwen3 line, 235B total weights (about 471 GB). Adds Interleaved-MRoPE for video reasoning, DeepStack multi-level ViT feature fusion, and Text-Timestamp Alignment for grounded event localization. Shipped with both Instruct and Thinking variants under Apache 2.0; native context to 256K, extensible to 1M.
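
As a rough check on the figures in this paragraph, the arithmetic can be sketched as below. Two assumptions are not stated on this page: the weights are stored at 2 bytes per parameter (a 16-bit format such as BF16), and "256K" is a binary K (256 × 1024). Under those assumptions the results land close to the "about 471 GB" and "262K" values shown here; the small gap on size would be consistent with a parameter count slightly above the rounded 235B.

```python
def weight_gigabytes(params_billions: float, bytes_per_param: float) -> float:
    """Approximate weight size in decimal gigabytes (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bytes_per_param
    return total_bytes / 1e9


def active_fraction(active_billions: float, total_billions: float) -> float:
    """Share of the total parameters used per token in a mixture-of-experts model."""
    return active_billions / total_billions


if __name__ == "__main__":
    # Assumption: 2 bytes per parameter (16-bit weights); not stated on the page.
    print(f"16-bit weights: ~{weight_gigabytes(235, 2):.0f} GB")

    # 22B active out of 235B total, taken from the model name and Specs rows.
    print(f"Active share per token: {active_fraction(22, 235):.1%}")

    # Assumption: "256K" means 256 * 1024 tokens.
    print(f"256K context in tokens: {256 * 1024}")
```

The active share matters for per-token compute, while the full weight size still determines how much memory is needed to host the model.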

Context · B

Trillion-parameter MoE, API-only via Qwen Chat and Alibaba Cloud at release. Pretrained on roughly 36T tokens. Alibaba's first proprietary Qwen flagship at this scale, breaking the open-weights pattern. Supports more than 100 languages.
