Qwen 2.5 14B Instruct — Apple Silicon Benchmarks
Measured inference speed for Qwen 2.5 14B Instruct across 52 Apple Silicon chip configurations, reported in tokens per second for each measured quantization. Real runs, not estimates.
Quantizations measured: Q4_K - Medium
Benchmark rows: 52
Chip tiers covered: 52
Fastest avg tok/s: 36.7, on M3 Ultra (80-core GPU, 256 GB)
Minimum RAM observed: not available
Benchmark results for Qwen 2.5 14B Instruct
Rows are sorted by average tok/s, descending. Click a source badge to see the original measurement page.
| Chip | Quant | Avg tok/s | Runtime | Source |
|---|---|---|---|---|
| M3 Ultra (80-core GPU, 256 GB) | Q4_K - Medium | 36.7 tok/s | — | ref |
| M2 Ultra (76-core GPU, 128 GB) | Q4_K - Medium | 36.6 tok/s | — | ref |
| M3 Ultra (80-core GPU, 512 GB) | Q4_K - Medium | 35.8 tok/s | — | ref |
| M3 Ultra (60-core GPU, 96 GB) | Q4_K - Medium | 34.4 tok/s | — | ref |
| M2 Ultra (60-core GPU, 64 GB) | Q4_K - Medium | 34.2 tok/s | — | ref |
| M1 Ultra (64-core GPU, 128 GB) | Q4_K - Medium | 32.4 tok/s | — | ref |
| M4 Max (40-core GPU, 48 GB) | Q4_K - Medium | 30.1 tok/s | — | ref |
| M4 Max (40-core GPU, 128 GB) | Q4_K - Medium | 28.7 tok/s | — | ref |
| M1 Ultra (48-core GPU, 128 GB) | Q4_K - Medium | 27.8 tok/s | — | ref |
| M4 Max (40-core GPU, 64 GB) | Q4_K - Medium | 27.7 tok/s | — | ref |
| M3 Max (40-core GPU, 128 GB) | Q4_K - Medium | 25.5 tok/s | — | ref |
| M2 Max (38-core GPU, 96 GB) | Q4_K - Medium | 25.2 tok/s | — | ref |
| M4 Max (32-core GPU, 36 GB) | Q4_K - Medium | 24.6 tok/s | — | ref |
| M2 Max (38-core GPU, 64 GB) | Q4_K - Medium | 22.0 tok/s | — | ref |
| M3 Max (30-core GPU, 96 GB) | Q4_K - Medium | 20.8 tok/s | — | ref |
| M2 Max (38-core GPU, 32 GB) | Q4_K - Medium | 20.6 tok/s | — | ref |
| M1 Max (32-core GPU, 32 GB) | Q4_K - Medium | 20.1 tok/s | — | ref |
| M3 Max (30-core GPU, 36 GB) | Q4_K - Medium | 19.8 tok/s | — | ref |
| M1 Max (32-core GPU, 64 GB) | Q4_K - Medium | 19.0 tok/s | — | ref |
| M4 Pro (20-core GPU, 64 GB) | Q4_K - Medium | 18.0 tok/s | — | ref |
| M4 Pro (20-core GPU, 24 GB) | Q4_K - Medium | 18.0 tok/s | — | ref |
| M4 Pro (20-core GPU, 48 GB) | Q4_K - Medium | 18.0 tok/s | — | ref |
| M1 Max (24-core GPU, 32 GB) | Q4_K - Medium | 17.4 tok/s | — | ref |
| M4 Pro (16-core GPU, 48 GB) | Q4_K - Medium | 16.8 tok/s | — | ref |
| M4 Pro (16-core GPU, 64 GB) | Q4_K - Medium | 16.1 tok/s | — | ref |
| M4 Pro (16-core GPU, 24 GB) | Q4_K - Medium | 15.2 tok/s | — | ref |
| M1 Max (24-core GPU, 64 GB) | Q4_K - Medium | 15.1 tok/s | — | ref |
| M2 Max (30-core GPU, 64 GB) | Q4_K - Medium | 14.5 tok/s | — | ref |
| M2 Pro (19-core GPU, 32 GB) | Q4_K - Medium | 14.1 tok/s | — | ref |
| M3 Max (40-core GPU, 64 GB) | Q4_K - Medium | 13.8 tok/s | — | ref |
| M2 Pro (16-core GPU, 16 GB) | Q4_K - Medium | 13.4 tok/s | — | ref |
| M3 Pro (14-core GPU, 36 GB) | Q4_K - Medium | 12.1 tok/s | — | ref |
| M3 Pro (18-core GPU, 36 GB) | Q4_K - Medium | 12.0 tok/s | — | ref |
| M1 Pro (16-core GPU, 16 GB) | Q4_K - Medium | 11.9 tok/s | — | ref |
| M3 Pro (14-core GPU, 18 GB) | Q4_K - Medium | 11.9 tok/s | — | ref |
| M3 Pro (18-core GPU, 18 GB) | Q4_K - Medium | 11.6 tok/s | — | ref |
| M1 Pro (16-core GPU, 32 GB) | Q4_K - Medium | 11.6 tok/s | — | ref |
| M5 (10-core GPU, 32 GB) | Q4_K - Medium | 11.5 tok/s | — | ref |
| M1 Pro (14-core GPU, 16 GB) | Q4_K - Medium | 10.8 tok/s | — | ref |
| M1 Pro (14-core GPU, 32 GB) | Q4_K - Medium | 10.4 tok/s | — | ref |
| M4 (10-core GPU, 24 GB) | Q4_K - Medium | 9.2 tok/s | — | ref |
| M4 (10-core GPU, 16 GB) | Q4_K - Medium | 8.7 tok/s | — | ref |
| M4 (10-core GPU, 32 GB) | Q4_K - Medium | 8.6 tok/s | — | ref |
| M2 (10-core GPU, 16 GB) | Q4_K - Medium | 8.1 tok/s | — | ref |
| M1 Ultra (GPU count not published, 128 GB) | Q4_K - Medium | 8.0 tok/s | — | ref |
| M2 (10-core GPU, 24 GB) | Q4_K - Medium | 7.3 tok/s | — | ref |
| M4 (8-core GPU, 16 GB) | Q4_K - Medium | 7.2 tok/s | — | ref |
| M2 (8-core GPU, 16 GB) | Q4_K - Medium | 7.0 tok/s | — | ref |
| M3 (10-core GPU, 24 GB) | Q4_K - Medium | 6.1 tok/s | — | ref |
| M1 (8-core GPU, 16 GB) | Q4_K - Medium | 5.4 tok/s | — | ref |
| M1 (7-core GPU, 16 GB) | Q4_K - Medium | 4.8 tok/s | — | ref |
| M3 (10-core GPU, 16 GB) | Q4_K - Medium | — | — | ref |
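For readers who want to work with these rows programmatically, here is a minimal Python sketch that parses pipe-delimited rows shaped like the table above and picks out the fastest measured entry. The names `Row`, `parse_rows`, and `SAMPLE` are hypothetical and introduced only for illustration. The sample holds a few rows copied from the table, not the full dataset, and a dash in the tok/s cell is assumed to mean that no measurement is available.

```python
import re
from typing import NamedTuple, Optional


class Row(NamedTuple):
    chip: str
    quant: str
    tok_s: Optional[float]  # None when the cell holds a dash (no measurement)


def parse_rows(table: str) -> list[Row]:
    """Parse pipe-delimited rows shaped like the benchmark table."""
    rows = []
    for line in table.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip the header row and the |---| separator row.
        if len(cells) < 3 or cells[0] == "Chip" or set(cells[0]) <= {"-"}:
            continue
        match = re.match(r"([0-9.]+)\s*tok/s", cells[2])
        rows.append(Row(cells[0], cells[1], float(match.group(1)) if match else None))
    return rows


# A few rows copied from the table above (illustrative subset only).
SAMPLE = """
| Chip | Quant | Avg tok/s | Runtime | Source |
|---|---|---|---|---|
| M3 Ultra (80-core GPU, 256 GB) | Q4_K - Medium | 36.7 tok/s | — | ref |
| M4 Pro (20-core GPU, 24 GB) | Q4_K - Medium | 18.0 tok/s | — | ref |
| M1 (7-core GPU, 16 GB) | Q4_K - Medium | 4.8 tok/s | — | ref |
| M3 (10-core GPU, 16 GB) | Q4_K - Medium | — | — | ref |
"""

rows = parse_rows(SAMPLE)
measured = [r for r in rows if r.tok_s is not None]
fastest = max(measured, key=lambda r: r.tok_s)
print(len(rows), len(measured), fastest.chip, fastest.tok_s)
```

Keeping the unmeasured row as `None` instead of dropping it at parse time lets the row count and the measured count be reported separately.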
Chips with published results for Qwen 2.5 14B Instruct
M1 (7-core GPU, 16 GB)
M1 (8-core GPU, 16 GB)
M1 Max (24-core GPU, 32 GB)
M1 Max (24-core GPU, 64 GB)
M1 Max (32-core GPU, 32 GB)
M1 Max (32-core GPU, 64 GB)
M1 Pro (14-core GPU, 16 GB)
M1 Pro (14-core GPU, 32 GB)
M1 Pro (16-core GPU, 16 GB)
M1 Pro (16-core GPU, 32 GB)
M1 Ultra (48-core GPU, 128 GB)
M1 Ultra (64-core GPU, 128 GB)
M1 Ultra (GPU count not published, 128 GB)
M2 (8-core GPU, 16 GB)
M2 (10-core GPU, 16 GB)
M2 (10-core GPU, 24 GB)
M2 Max (30-core GPU, 64 GB)
M2 Max (38-core GPU, 32 GB)
M2 Max (38-core GPU, 64 GB)
M2 Max (38-core GPU, 96 GB)
M2 Pro (16-core GPU, 16 GB)
M2 Pro (19-core GPU, 32 GB)
M2 Ultra (60-core GPU, 64 GB)
M2 Ultra (76-core GPU, 128 GB)
M3 (10-core GPU, 16 GB)
M3 (10-core GPU, 24 GB)
M3 Max (30-core GPU, 36 GB)
M3 Max (30-core GPU, 96 GB)
M3 Max (40-core GPU, 64 GB)
M3 Max (40-core GPU, 128 GB)
M3 Pro (14-core GPU, 18 GB)
M3 Pro (14-core GPU, 36 GB)
M3 Pro (18-core GPU, 18 GB)
M3 Pro (18-core GPU, 36 GB)
M3 Ultra (60-core GPU, 96 GB)
M3 Ultra (80-core GPU, 256 GB)
M3 Ultra (80-core GPU, 512 GB)
M4 (8-core GPU, 16 GB)
M4 (10-core GPU, 16 GB)
M4 (10-core GPU, 24 GB)
M4 (10-core GPU, 32 GB)
M4 Max (32-core GPU, 36 GB)
M4 Max (40-core GPU, 48 GB)
M4 Max (40-core GPU, 64 GB)
M4 Max (40-core GPU, 128 GB)
M4 Pro (16-core GPU, 24 GB)
M4 Pro (16-core GPU, 48 GB)
M4 Pro (16-core GPU, 64 GB)
M4 Pro (20-core GPU, 24 GB)
M4 Pro (20-core GPU, 48 GB)
M4 Pro (20-core GPU, 64 GB)
M5 (10-core GPU, 32 GB)
Data
benchmarks.json — full dataset · models.json — model summaries · benchmarks.csv — CSV export