Apple Silicon RAM Calculator

Can your Mac run this model? Select a model and quantization to see which chips have enough RAM — and how fast they'll run it.

Quick reference: RAM by model size

Approximate VRAM/unified memory needed at each quantization. Add ~2–4 GB for OS and runtime overhead.

| Model size | Q4_K_M | Q5_K_M | Q6_K | Q8_0 | Minimum Mac |
|---|---|---|---|---|---|
| 1B | ~0.8 GB | ~1.0 GB | ~1.2 GB | ~1.5 GB | 8 GB (any M-series) |
| 3B | ~2.0 GB | ~2.5 GB | ~3.0 GB | ~3.5 GB | 8 GB (any M-series) |
| 7B–8B | ~4.5 GB | ~5.5 GB | ~6.5 GB | ~8.5 GB | 16 GB (M-series base) |
| 14B | ~9 GB | ~11 GB | ~13 GB | ~16 GB | 24 GB (M Pro+) |
| 32B | ~20 GB | ~24 GB | ~29 GB | ~35 GB | 36–48 GB (M Max) |
| 70B | ~43 GB | ~53 GB | ~63 GB | ~75 GB | 64 GB (M Max 64 GB+) |
| 105B | ~65 GB | ~79 GB | ~94 GB | ~112 GB | 128 GB (M Max 128 GB) |
| 235B (MoE) | ~130–140 GB | ~160 GB | ~190 GB | ~240 GB | 192 GB (M Ultra) |
| 405B | ~245 GB | ~300 GB | - | - | 512 GB (M3 Ultra) |
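The overhead rule can be turned into a quick fit check. The following is a minimal sketch, not the calculator's actual logic: the names `WEIGHTS_GB`, `total_ram_needed`, and `fits` are hypothetical, the figures are copied from a few rows of the table above, and the default overhead of 4 GB is an assumption that takes the upper end of the ~2–4 GB range.

```python
# Approximate weights-only footprint in GB, copied from the quick-reference
# table above (a subset of rows; extend as needed).
WEIGHTS_GB = {
    "1B":    {"Q4_K_M": 0.8, "Q5_K_M": 1.0, "Q6_K": 1.2, "Q8_0": 1.5},
    "3B":    {"Q4_K_M": 2.0, "Q5_K_M": 2.5, "Q6_K": 3.0, "Q8_0": 3.5},
    "7B-8B": {"Q4_K_M": 4.5, "Q5_K_M": 5.5, "Q6_K": 6.5, "Q8_0": 8.5},
    "14B":   {"Q4_K_M": 9.0, "Q5_K_M": 11.0, "Q6_K": 13.0, "Q8_0": 16.0},
    "32B":   {"Q4_K_M": 20.0, "Q5_K_M": 24.0, "Q6_K": 29.0, "Q8_0": 35.0},
    "70B":   {"Q4_K_M": 43.0, "Q5_K_M": 53.0, "Q6_K": 63.0, "Q8_0": 75.0},
}


def total_ram_needed(size: str, quant: str, overhead_gb: float = 4.0) -> float:
    """Weights plus an assumed allowance for the OS and runtime."""
    return WEIGHTS_GB[size][quant] + overhead_gb


def fits(size: str, quant: str, ram_gb: float, overhead_gb: float = 4.0) -> bool:
    """True if the model at this quantization fits in the given unified memory."""
    return total_ram_needed(size, quant, overhead_gb) <= ram_gb


if __name__ == "__main__":
    for size, quant, ram in [("7B-8B", "Q4_K_M", 16), ("14B", "Q8_0", 16)]:
        print(size, quant, f"{ram} GB:", "fits" if fits(size, quant, ram) else "does not fit")
```

Under these assumptions a 14B model at Q8_0 needs about 20 GB in total, which is why it does not fit on a 16 GB machine even though the weights alone are ~16 GB.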

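MoE (Mixture of Experts) models like Qwen 3 235B A22B activate only a subset of their parameters for each token (about 22B of 235B here), so they generate faster than a dense model of the same total size. All experts must still be loaded into memory, so RAM scales with the total parameter count, not the active count: a 235B MoE model at Q4 needs ~130–140 GB.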

Apple Silicon RAM tiers

| Chip | RAM options | Largest model at Q4_K_M | Best for |
|---|---|---|---|
| M4, M3, M2, M1 (base) | 8–32 GB | 8B (16 GB) · 14B (24 GB) | 7B–8B daily use |
| M4 Pro, M3 Pro, M2 Pro | 24–64 GB | 14B (24 GB) · 32B (48+ GB) | 14B daily, occasional 32B |
| M4 Max, M3 Max, M2 Max | 36–128 GB | 32B (48 GB) · 70B (128 GB) | 32B–70B inference |
| M2 Ultra, M3 Ultra | 64–512 GB | 235B (192+ GB) · 405B (512 GB) | Maximum model size |
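As a rough illustration of how the tier table can be queried, the sketch below lists the chip tiers whose largest RAM option covers a given requirement. The names `TIER_RAM_GB` and `tiers_with_enough_ram` are hypothetical, and the ranges are copied from the table above. It checks capacity only and ignores speed, so it may list a tier that the table does not recommend for that model.

```python
# (tier name, smallest RAM option in GB, largest RAM option in GB),
# copied from the RAM tiers table above.
TIER_RAM_GB = [
    ("base", 8, 32),
    ("Pro", 24, 64),
    ("Max", 36, 128),
    ("Ultra", 64, 512),
]


def tiers_with_enough_ram(required_gb: float) -> list[str]:
    """Return tiers that offer at least one configuration with enough RAM."""
    return [name for name, _min_gb, max_gb in TIER_RAM_GB if max_gb >= required_gb]


if __name__ == "__main__":
    # Example: ~43 GB of weights (70B at Q4_K_M) plus an assumed 4 GB overhead.
    print(tiers_with_enough_ram(43 + 4))
```

The required figure passed in should already include the OS and runtime overhead, since the tier ranges describe total installed memory.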

About quantization and quality

| Quantization | Size vs F32 | Quality | Speed | Recommended use |
|---|---|---|---|---|
| Q2_K | ~25% | Noticeably degraded | Fastest | When RAM is severely limited |
| Q3_K_M | ~35% | Somewhat degraded | Very fast | When RAM is tight |
| Q4_K_M | ~45% | Good, minimal loss | Fast | Best daily driver |
| Q5_K_M | ~55% | Very good | Moderately fast | Quality-focused use |
| Q6_K | ~65% | Excellent, near full | Moderate | High-quality tasks |
| Q8_0 | ~83% | Near-lossless | Slower | Benchmarking, max quality |
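One way to use this table is to pick the highest-quality quantization that still fits in memory. The sketch below is illustrative only: `RELATIVE_SIZE`, `estimate_size_gb`, and `best_quant` are hypothetical names, the percentages are taken from the table above, and the 4 GB overhead default is an assumption. It scales a known Q4_K_M size by the ratio of the table's percentages, so only the ratios between rows matter, not the baseline they are measured against.

```python
# Relative sizes (percent) from the quantization table, ordered from
# highest quality to lowest.
RELATIVE_SIZE = {
    "Q8_0": 83,
    "Q6_K": 65,
    "Q5_K_M": 55,
    "Q4_K_M": 45,
    "Q3_K_M": 35,
    "Q2_K": 25,
}


def estimate_size_gb(q4_size_gb: float, quant: str) -> float:
    """Scale a known Q4_K_M footprint to another quantization level."""
    return q4_size_gb * RELATIVE_SIZE[quant] / RELATIVE_SIZE["Q4_K_M"]


def best_quant(q4_size_gb: float, ram_gb: float, overhead_gb: float = 4.0):
    """Return the highest-quality quantization that fits, or None if none does."""
    budget = ram_gb - overhead_gb
    for quant in RELATIVE_SIZE:  # dicts preserve insertion order
        if estimate_size_gb(q4_size_gb, quant) <= budget:
            return quant
    return None


if __name__ == "__main__":
    print(best_quant(4.5, 16))  # 7B-8B model on a 16 GB Mac
    print(best_quant(9.0, 16))  # 14B model on a 16 GB Mac
```

With the table's figures, a 7B–8B model (~4.5 GB at Q4_K_M) on a 16 GB Mac comes out at Q8_0, while a 14B model (~9 GB at Q4_K_M) on the same machine comes out at Q5_K_M.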

Related tools and guides

benchmarks.json — full dataset  ·  chips.json — chip summaries  ·  benchmarks.csv — CSV export