
Qwen 2.5 14B Instruct — Apple Silicon Benchmarks

Measured inference speed for Qwen 2.5 14B Instruct across 52 Apple Silicon chip configurations, reported in tokens per second at the Q4_K - Medium quantization level. Real runs, not estimates.

Quantizations measured: Q4_K - Medium

Benchmark rows: 52
Chip tiers covered: 52
Fastest avg tok/s: 36.7, M3 Ultra (80-core GPU, 256 GB)
Minimum RAM observed: (no value shown)

Benchmark results for Qwen 2.5 14B Instruct

Rows are sorted by avg tok/s, descending. The source badge (ref) in each row links to the original measurement page.

| Chip | Quant | RAM req. | Context | Avg tok/s | Prompt tok/s | Runtime | Source |
|---|---|---|---|---|---|---|---|
| M3 Ultra (80-core GPU, 256 GB) | Q4_K - Medium | | | 36.7 | 568.3 | | ref |
| M2 Ultra (76-core GPU, 128 GB) | Q4_K - Medium | | | 36.6 | 470.5 | | ref |
| M3 Ultra (80-core GPU, 512 GB) | Q4_K - Medium | | | 35.8 | 577.4 | | ref |
| M3 Ultra (60-core GPU, 96 GB) | Q4_K - Medium | | | 34.4 | 444.6 | | ref |
| M2 Ultra (60-core GPU, 64 GB) | Q4_K - Medium | | | 34.2 | 381.1 | | ref |
| M1 Ultra (64-core GPU, 128 GB) | Q4_K - Medium | | | 32.4 | 371.7 | | ref |
| M4 Max (40-core GPU, 48 GB) | Q4_K - Medium | | | 30.1 | 347.0 | | ref |
| M4 Max (40-core GPU, 128 GB) | Q4_K - Medium | | | 28.7 | 326.6 | | ref |
| M1 Ultra (48-core GPU, 128 GB) | Q4_K - Medium | | | 27.8 | 289.9 | | ref |
| M4 Max (40-core GPU, 64 GB) | Q4_K - Medium | | | 27.7 | 306.5 | | ref |
| M3 Max (40-core GPU, 128 GB) | Q4_K - Medium | | | 25.5 | 302.4 | | ref |
| M2 Max (38-core GPU, 96 GB) | Q4_K - Medium | | | 25.2 | 252.5 | | ref |
| M4 Max (32-core GPU, 36 GB) | Q4_K - Medium | | | 24.6 | 273.4 | | ref |
| M2 Max (38-core GPU, 64 GB) | Q4_K - Medium | | | 22.0 | 222.8 | | ref |
| M3 Max (30-core GPU, 96 GB) | Q4_K - Medium | | | 20.8 | 238.8 | | ref |
| M2 Max (38-core GPU, 32 GB) | Q4_K - Medium | | | 20.6 | 225.6 | | ref |
| M1 Max (32-core GPU, 32 GB) | Q4_K - Medium | | | 20.1 | 195.8 | | ref |
| M3 Max (30-core GPU, 36 GB) | Q4_K - Medium | | | 19.8 | 226.2 | | ref |
| M1 Max (32-core GPU, 64 GB) | Q4_K - Medium | | | 19.0 | 185.9 | | ref |
| M4 Pro (20-core GPU, 64 GB) | Q4_K - Medium | | | 18.0 | 183.0 | | ref |
| M4 Pro (20-core GPU, 24 GB) | Q4_K - Medium | | | 18.0 | 190.4 | | ref |
| M4 Pro (20-core GPU, 48 GB) | Q4_K - Medium | | | 18.0 | 189.8 | | ref |
| M1 Max (24-core GPU, 32 GB) | Q4_K - Medium | | | 17.4 | 155.9 | | ref |
| M4 Pro (16-core GPU, 48 GB) | Q4_K - Medium | | | 16.8 | 161.1 | | ref |
| M4 Pro (16-core GPU, 64 GB) | Q4_K - Medium | | | 16.1 | 151.0 | | ref |
| M4 Pro (16-core GPU, 24 GB) | Q4_K - Medium | | | 15.2 | 144.3 | | ref |
| M1 Max (24-core GPU, 64 GB) | Q4_K - Medium | | | 15.1 | 140.2 | | ref |
| M2 Max (30-core GPU, 64 GB) | Q4_K - Medium | | | 14.5 | 149.0 | | ref |
| M2 Pro (19-core GPU, 32 GB) | Q4_K - Medium | | | 14.1 | 137.3 | | ref |
| M3 Max (40-core GPU, 64 GB) | Q4_K - Medium | | | 13.8 | 200.8 | | ref |
| M2 Pro (16-core GPU, 16 GB) | Q4_K - Medium | | | 13.4 | 119.0 | | ref |
| M3 Pro (14-core GPU, 36 GB) | Q4_K - Medium | | | 12.1 | 119.8 | | ref |
| M3 Pro (18-core GPU, 36 GB) | Q4_K - Medium | | | 12.0 | 147.4 | | ref |
| M1 Pro (16-core GPU, 16 GB) | Q4_K - Medium | | | 11.9 | 106.7 | | ref |
| M3 Pro (14-core GPU, 18 GB) | Q4_K - Medium | | | 11.9 | 117.0 | | ref |
| M3 Pro (18-core GPU, 18 GB) | Q4_K - Medium | | | 11.6 | 144.8 | | ref |
| M1 Pro (16-core GPU, 32 GB) | Q4_K - Medium | | | 11.6 | 104.5 | | ref |
| M5 (10-core GPU, 32 GB) | Q4_K - Medium | | | 11.5 | 110.4 | | ref |
| M1 Pro (14-core GPU, 16 GB) | Q4_K - Medium | | | 10.8 | 92.5 | | ref |
| M1 Pro (14-core GPU, 32 GB) | Q4_K - Medium | | | 10.4 | 88.9 | | ref |
| M4 (10-core GPU, 24 GB) | Q4_K - Medium | | | 9.2 | 93.3 | | ref |
| M4 (10-core GPU, 16 GB) | Q4_K - Medium | | | 8.7 | 83.1 | | ref |
| M4 (10-core GPU, 32 GB) | Q4_K - Medium | | | 8.6 | 79.3 | | ref |
| M2 (10-core GPU, 16 GB) | Q4_K - Medium | | | 8.1 | 74.8 | | ref |
| M1 Ultra (GPU count not published, 128 GB) | Q4_K - Medium | | | 8.0 | 38.8 | | ref |
| M2 (10-core GPU, 24 GB) | Q4_K - Medium | | | 7.3 | 70.3 | | ref |
| M4 (8-core GPU, 16 GB) | Q4_K - Medium | | | 7.2 | 61.1 | | ref |
| M2 (8-core GPU, 16 GB) | Q4_K - Medium | | | 7.0 | 60.0 | | ref |
| M3 (10-core GPU, 24 GB) | Q4_K - Medium | | | 6.1 | 65.8 | | ref |
| M1 (8-core GPU, 16 GB) | Q4_K - Medium | | | 5.4 | 53.3 | | ref |
| M1 (7-core GPU, 16 GB) | Q4_K - Medium | | | 4.8 | 40.6 | | ref |
| M3 (10-core GPU, 16 GB) | Q4_K - Medium | | | | 9786.5 | | ref |
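
The ordering and the "fastest avg tok/s" summary can be reproduced with a few lines of Python. This is a minimal sketch, not the site's actual code: the `Row` type and the helper names are hypothetical, only four rows from the table are included, and it assumes that the M3 (10-core GPU, 16 GB) row has no avg tok/s value and that its single figure is a prompt tok/s reading.

```python
from typing import List, NamedTuple, Optional


class Row(NamedTuple):
    """One benchmark row (hypothetical structure, not the site's schema)."""
    chip: str
    avg_tok_s: Optional[float]     # generation speed; None when not shown
    prompt_tok_s: Optional[float]  # prompt processing speed


# A small sample copied from the table above.
rows = [
    Row("M4 Max (40-core GPU, 48 GB)", 30.1, 347.0),
    # Assumption: the single value on this row is prompt tok/s, avg is missing.
    Row("M3 (10-core GPU, 16 GB)", None, 9786.5),
    Row("M3 Ultra (80-core GPU, 256 GB)", 36.7, 568.3),
    Row("M1 (7-core GPU, 16 GB)", 4.8, 40.6),
]


def sort_by_avg_desc(items: List[Row]) -> List[Row]:
    """Sort by avg tok/s, highest first; rows without a value go last."""
    return sorted(
        items,
        key=lambda r: (r.avg_tok_s is None, -(r.avg_tok_s or 0.0)),
    )


def fastest(items: List[Row]) -> Optional[Row]:
    """Return the row with the highest avg tok/s, ignoring missing values."""
    measured = [r for r in items if r.avg_tok_s is not None]
    return max(measured, key=lambda r: r.avg_tok_s) if measured else None


ordered = sort_by_avg_desc(rows)
best = fastest(rows)

for r in ordered:
    print(r.chip, r.avg_tok_s, r.prompt_tok_s)
if best is not None:
    print("Fastest:", best.chip, best.avg_tok_s)
```

Sorting on the tuple `(is_missing, -value)` keeps rows without an avg tok/s figure at the bottom, which matches where that row appears in the table.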

benchmarks.json — full dataset  ·  models.json — model summaries  ·  benchmarks.csv — CSV export
