Qwen 3 30B A3B — Apple Silicon Benchmarks

Measured inference speed for Qwen 3 30B A3B across 2 Apple Silicon chips. Tokens per second at multiple quantization levels. Real runs, not estimates.

Quantizations measured: Q4, Q5, Q6, Q4_K_M, Q8

Benchmark rows: 5
Chip tiers covered: 2
Fastest avg tok/s: 92.1 on M4 Max (40-core GPU, 64 GB)
Minimum RAM observed: 16.12 GB

Benchmark results for Qwen 3 30B A3B

Rows are sorted by average tok/s, descending. Click a source badge to see the original measurement page.

| Chip | Quant | RAM req. | Context | Avg tok/s | Prompt tok/s | Runtime | Source |
|---|---|---|---|---|---|---|---|
| M4 Max (40-core GPU, 64 GB) | Q4 | 16.1 GB | 2k | 92.1 | 822.6 | MLX | ref |
| M4 Max (40-core GPU, 64 GB) | Q5 | 18.1 GB | 2k | 84.9 | 819.8 | MLX | ref |
| M4 Max (40-core GPU, 64 GB) | Q6 | 21.9 GB | 2k | 76.7 | 817.6 | MLX | ref |
| M4 Max (128 GB) | Q4_K_M | | 10k | 70.2 | | LM Studio | ref |
| M4 Max (40-core GPU, 64 GB) | Q8 | 29.8 GB | 2k | 52.6 | 772.6 | MLX | ref |
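
As a rough illustration of how the summary figures follow from the rows above, the sketch below transcribes the table by hand and recomputes the ranking, the fastest row, the minimum RAM, and each MLX quantization's generation speed relative to Q4. The field names (`chip`, `quant`, `ram_gb`, `avg_tps`, and so on) are hypothetical and chosen for this example; they are not necessarily the keys used in benchmarks.json or benchmarks.csv. Blank cells are represented as `None`.

```python
# Rows copied by hand from the results table; field names are illustrative only.
# None marks cells that the table leaves blank.
rows = [
    {"chip": "M4 Max (40-core GPU, 64 GB)", "quant": "Q4", "ram_gb": 16.1,
     "context": "2k", "avg_tps": 92.1, "prompt_tps": 822.6, "runtime": "MLX"},
    {"chip": "M4 Max (40-core GPU, 64 GB)", "quant": "Q5", "ram_gb": 18.1,
     "context": "2k", "avg_tps": 84.9, "prompt_tps": 819.8, "runtime": "MLX"},
    {"chip": "M4 Max (40-core GPU, 64 GB)", "quant": "Q6", "ram_gb": 21.9,
     "context": "2k", "avg_tps": 76.7, "prompt_tps": 817.6, "runtime": "MLX"},
    {"chip": "M4 Max (128 GB)", "quant": "Q4_K_M", "ram_gb": None,
     "context": "10k", "avg_tps": 70.2, "prompt_tps": None, "runtime": "LM Studio"},
    {"chip": "M4 Max (40-core GPU, 64 GB)", "quant": "Q8", "ram_gb": 29.8,
     "context": "2k", "avg_tps": 52.6, "prompt_tps": 772.6, "runtime": "MLX"},
]

# Sort by average generation speed, fastest first (the order the table uses).
ranked = sorted(rows, key=lambda r: r["avg_tps"], reverse=True)
fastest = ranked[0]

# Summary figures: row count, distinct chip configurations, minimum known RAM.
row_count = len(rows)
chip_tiers = {r["chip"] for r in rows}
min_ram_gb = min(r["ram_gb"] for r in rows if r["ram_gb"] is not None)

# Generation speed of each MLX row relative to the MLX Q4 row on the same chip.
mlx_rows = [r for r in rows if r["runtime"] == "MLX"]
q4_tps = next(r["avg_tps"] for r in mlx_rows if r["quant"] == "Q4")
relative_to_q4 = {r["quant"]: r["avg_tps"] / q4_tps for r in mlx_rows}

if __name__ == "__main__":
    print(f"{row_count} rows across {len(chip_tiers)} chip tiers")
    print(f"Fastest: {fastest['quant']} on {fastest['chip']} at {fastest['avg_tps']} tok/s")
    print(f"Minimum RAM in table: {min_ram_gb} GB")
    for quant, ratio in relative_to_q4.items():
        print(f"{quant}: {ratio:.0%} of Q4 generation speed")
```

The relative figures compare only the MLX rows, since the LM Studio row uses a different runtime, context length, and memory configuration and is not a like-for-like comparison. The table rounds RAM to one decimal place, so the minimum computed here is 16.1 GB rather than the 16.12 GB shown in the summary.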

benchmarks.json — full dataset  ·  models.json — model summaries  ·  benchmarks.csv — CSV export