
24 GB unified memory · 273 GB/s memory bandwidth
$1,999
Updated 2026-03-01
The Apple M4 Pro (24GB Unified) is rated here against 11 AI models across chat, coding, and AI coding workloads; 10 of them run, and one (DeepSeek V3) does not fit in 24 GB. Best performance: Llama 3.2 1B Instruct at 90 tok/s (Excellent). For AI coding workflows, it falls in the Power AI Coding tier: it runs 32B coding models at good quality, though only at about 10 tok/s. Current price: approximately $1,999.
— OwnRig methodology, data updated 2026-03-01
Runs 32B coding models at good quality. Can run a coding model and an embeddings model concurrently.
| Model | Quant | Speed | Rating | Notes |
|---|---|---|---|---|
| Llama 3.1 8B Instruct | Q8_0 | 32 tok/s | Good | Unified memory means no PCIe bottleneck, but lower bandwidth (273 GB/s) limits throughput vs discrete GPUs. |
| Qwen 2.5 Coder 32B Instruct | Q4_K_M | 10 tok/s | Acceptable | Fits at Q4 (18.4GB on 24GB) but M4 Pro bandwidth (273 GB/s) limits speed significantly for this model size. Usable for non-latency-sensitive coding. |
| Gemma 3 4B | Q5_K_M | 38 tok/s | Good | Lightweight model ideal for M4 Pro. Good for quick tasks. |
| Qwen 2.5 Coder 7B Instruct | Q5_K_M | 35 tok/s | Good | Good coding performance on M4 Pro. Usable for daily development. |
| Llama 3.2 3B Instruct | Q8_0 | 60 tok/s | Excellent | 273 GB/s limits throughput but 3B still runs well. Good for quick tasks on Mac. |
| Llama 3.2 1B Instruct | Q8_0 | 90 tok/s | Excellent | 1B runs well on 273 GB/s. Good for quick inference on Mac. |
| Phi-4 Mini | Q8_0 | 55 tok/s | Excellent | 273 GB/s limits 3.8B speed. Usable for reasoning tasks on Mac. |
| Whisper Large V3 Turbo | FP16 | — | Excellent | Uses Core ML acceleration for transcription. Real-time capable on Mac. |
| Stable Diffusion 3.5 Large | FP16 | — | Acceptable | Slow due to limited bandwidth (273 GB/s) but functional. ~20s per image. |
| Gemma 3 27B | Q4_K_M | 8 tok/s | Acceptable | Q4_K_M fits with 7.7GB headroom. Low bandwidth (273 GB/s) is the bottleneck. |
| DeepSeek V3 | Q2_K | — | Not Viable | 671B MoE model requires 115GB+ at Q2_K. 24GB insufficient. Would need M4 Max 128GB. |
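The notes above repeatedly cite the 273 GB/s bandwidth as the limiting factor for larger models. A rough way to see why is the common rule of thumb that generating each token of a dense model reads roughly all of the weights once, so decode speed cannot exceed bandwidth divided by model size. The sketch below applies that rule to two rows of the table. It is an illustrative estimate under that assumption, not OwnRig's methodology; the function names are hypothetical, the 16.3 GB figure for Gemma 3 27B is derived from the stated 7.7 GB headroom (24 − 7.7), and KV cache, compute limits, and OS memory overhead are ignored.

```python
# Rough bandwidth-bound estimate for token generation (decode) speed.
# Assumption: each generated token streams all model weights from memory once,
# so the ceiling is bandwidth / model size. Real throughput is lower.

BANDWIDTH_GB_S = 273.0  # memory bandwidth quoted on this page
MEMORY_GB = 24.0        # unified memory quoted on this page


def decode_ceiling_tok_s(model_gb: float, bandwidth_gb_s: float = BANDWIDTH_GB_S) -> float:
    """Upper bound on tokens per second for a dense model of the given size."""
    if model_gb <= 0:
        raise ValueError("model_gb must be positive")
    return bandwidth_gb_s / model_gb


def headroom_gb(model_gb: float, memory_gb: float = MEMORY_GB) -> float:
    """Memory left after loading the weights (negative means it does not fit)."""
    return memory_gb - model_gb


# (weights size in GB, measured tok/s from the table above)
rows = {
    "Qwen 2.5 Coder 32B Instruct (Q4_K_M)": (18.4, 10),
    "Gemma 3 27B (Q4_K_M)": (16.3, 8),  # 16.3 GB derived from 7.7 GB headroom
}

for name, (size_gb, measured) in rows.items():
    ceiling = decode_ceiling_tok_s(size_gb)
    print(
        f"{name}: ceiling ~{ceiling:.1f} tok/s, measured {measured} tok/s, "
        f"headroom {headroom_gb(size_gb):.1f} GB"
    )
```

Under this estimate both models have a ceiling in the mid-teens of tokens per second, which is consistent with the measured 8 to 10 tok/s and with the "Acceptable" ratings in the table.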
Prices and availability vary. Inspect hardware before purchasing.
Generation: M4. Last updated: 2026-03-01.