Qwen 3 32B — Apple Silicon Benchmarks
Measured inference speed for Qwen 3 32B on 2 Apple Silicon chips, reported in tokens per second at 2 quantization levels. All figures come from real runs, not estimates.
Quantizations measured: Q4_K_M, iQ2_K_S
Benchmark rows: 2
Chip tiers covered: 2
Fastest avg tok/s: 22.0, on M4 Max (40-core GPU, 64 GB)
Minimum RAM observed: 11 GB
Benchmark results for Qwen 3 32B
Rows are sorted by average tok/s, descending. The Source column indicates where each measurement originated.
| Chip | Quant | Avg tok/s | Runtime | Source |
|---|---|---|---|---|
| M4 Max (40-core GPU, 64 GB) | Q4_K_M | 22.0 tok/s | factory harness | factory lab |
| M4 Max (32-core GPU) | iQ2_K_S | 13.2 tok/s | — | ref |
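To make the ordering and the tok/s figures concrete, here is a minimal Python sketch that ranks the two rows from the table above and converts an average rate into a rough generation time. The `BenchmarkRow` class and the function names are hypothetical, introduced only for illustration. The time estimate assumes a steady output rate and ignores prompt processing, so treat it as a ballpark figure rather than a measurement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkRow:
    # Hypothetical container; field names are illustrative, not a published schema.
    chip: str
    quant: str
    avg_tok_s: float


# The two rows from the results table, copied by hand (deliberately unsorted here).
ROWS = [
    BenchmarkRow("M4 Max (32-core GPU)", "iQ2_K_S", 13.2),
    BenchmarkRow("M4 Max (40-core GPU, 64 GB)", "Q4_K_M", 22.0),
]


def sort_by_speed(rows):
    """Return rows ordered by average tokens per second, fastest first."""
    return sorted(rows, key=lambda r: r.avg_tok_s, reverse=True)


def seconds_for(tokens, avg_tok_s):
    """Rough time to generate `tokens` output tokens at a steady average rate."""
    return tokens / avg_tok_s


ranked = sort_by_speed(ROWS)
fastest = ranked[0]

if __name__ == "__main__":
    for row in ranked:
        estimate = seconds_for(500, row.avg_tok_s)
        print(f"{row.chip} [{row.quant}]: {row.avg_tok_s} tok/s, "
              f"about {estimate:.1f} s for 500 tokens")
```

Because the two rows use different quantizations as well as different GPU core counts, the ranking compares whole configurations, not the chips alone.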
Chips with published results for Qwen 3 32B
Data
benchmarks.json — full dataset · models.json — model summaries · benchmarks.csv — CSV export
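
The CSV export could be summarised with a few lines of Python. The sketch below is an assumption-heavy illustration: the column names `chip`, `quant`, and `avg_tok_s` are hypothetical and simply mirror the table columns, since the actual schema of benchmarks.csv is not described on this page. The sample text stands in for the file contents and reuses the two rows from the results table.

```python
import csv
import io


def fastest_per_chip(csv_text):
    """Map each chip to its (quant, avg tok/s) pair with the highest speed.

    Assumes hypothetical columns `chip`, `quant`, and `avg_tok_s`;
    adjust the keys to match the real export before using this.
    """
    best = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        chip = row["chip"]
        speed = float(row["avg_tok_s"])
        if chip not in best or speed > best[chip][1]:
            best[chip] = (row["quant"], speed)
    return best


# Stand-in for the file contents, built from the two table rows.
SAMPLE = '''chip,quant,avg_tok_s
"M4 Max (40-core GPU, 64 GB)",Q4_K_M,22.0
M4 Max (32-core GPU),iQ2_K_S,13.2
'''

summary = fastest_per_chip(SAMPLE)

if __name__ == "__main__":
    for chip, (quant, speed) in summary.items():
        print(f"{chip}: {speed} tok/s at {quant}")
```

To run this against a downloaded file, read it into a string first, for example with `open("benchmarks.csv", newline="").read()`, and rename the keys if the real header differs.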