Apple M3 Pro (18GB Unified)
18 GB Unified · 150 GB/s
From $1,799 (estimated street price)
VRAM: 18 GB
Bandwidth: 150 GB/s
TDP: 30W
Models: 59
Tier: Capable
The Apple M3 Pro (18GB Unified) has 18 GB of unified memory and is evaluated against 59 AI models across the embedding, AI building, and coding categories; the table below rates 40 of them Marginal or better and 19 Not viable. Best performance: all-MiniLM-L6-v2 at 1200 tok/s (rated Good). For AI coding workflows, it falls in the Capable AI Coding tier, which handles single-model workflows well. Current price: approximately $1,799.
Source: OwnRig methodology
18 GB
150 GB/s
Unified
30W
14
MacBook Pro 14", MacBook Pro 16"
Builder Capability: Capable AI Coding
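The Capable tier is described as running 16-22B coding models comfortably, or 32B at reduced quality, and handling single-model workflows well. On this device, the table below rates DeepSeek Coder V2 Lite 16B as Good (20 tok/s at Q4_K_M) but Codestral 22B and Qwen 2.5 Coder 32B Instruct as Not viable at Q3_K_M.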
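Whether a given model size fits in 18 GB comes down to simple arithmetic on parameter count and quantization. The sketch below is a rough estimate, not the page's rating methodology: the bits-per-weight values are approximate averages for GGUF quantization types, and `usable_fraction` and `overhead_gb` are hypothetical parameters standing in for the share of unified memory available to the GPU and for KV cache and runtime buffers.

```python
# Approximate bits per weight for common GGUF quantization types.
# ASSUMPTION: rough averages for illustration; real files vary by architecture.
APPROX_BITS_PER_WEIGHT = {
    "Q2_K": 3.0,
    "Q3_K_M": 3.9,
    "Q4_K_M": 4.85,
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
    "FP16": 16.0,
}


def weight_footprint_gb(params_billion: float, quant: str) -> float:
    """Estimate the size of the model weights in GB (1 GB = 1e9 bytes)."""
    bits = APPROX_BITS_PER_WEIGHT[quant]
    # (params_billion * 1e9 weights) * (bits / 8 bytes) / 1e9 bytes per GB
    return params_billion * bits / 8.0


def fits(
    params_billion: float,
    quant: str,
    total_memory_gb: float = 18.0,
    usable_fraction: float = 0.7,  # ASSUMPTION: share of unified memory usable for the model
    overhead_gb: float = 1.5,      # ASSUMPTION: KV cache and runtime buffers
) -> bool:
    """Return True if weights plus overhead fit in the assumed memory budget."""
    budget_gb = total_memory_gb * usable_fraction
    return weight_footprint_gb(params_billion, quant) + overhead_gb <= budget_gb


if __name__ == "__main__":
    for size, quant in [(16, "Q4_K_M"), (32, "Q3_K_M"), (70, "Q2_K")]:
        gb = weight_footprint_gb(size, quant)
        print(f"{size}B {quant}: ~{gb:.1f} GB weights, fits={fits(size, quant)}")
```

Under these assumptions a 16B model at Q4_K_M fits with room to spare, while 32B at Q3_K_M and 70B at Q2_K exceed the budget. The estimate covers memory only; it says nothing about speed or output quality, so it will not reproduce every rating in the table.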
Inference Backends
The software stacks that matter most for real-world inference on this device.
Metal (production): Primary Apple Silicon backend across MLX and llama.cpp workloads.
What it can run
59 models

| Model | Quantization | Speed | Rating |
| --- | --- | --- | --- |
| all-MiniLM-L6-v2 | FP16 | 1200 tok/s | Good |
| Arcee Trinity Mini 26B | Q4_K_M | 8 tok/s | Not viable |
| Arcee Trinity Nano 6B | Q8_0 | 32 tok/s | Good |
| Code Llama 34B Instruct | Q3_K_M | n/a | Not viable |
| Codestral 22B | Q3_K_M | n/a | Not viable |
| Command R 35B | Q3_K_M | n/a | Not viable |
| DeepSeek Coder V2 Lite 16B | Q4_K_M | 20 tok/s | Good |
| DeepSeek R1 Distill Qwen 32B | Q3_K_M | n/a | Not viable |
| DeepSeek R1 Distill Qwen 7B | Q4_K_M | 14 tok/s | Acceptable |
| DeepSeek V3 | Q2_K | n/a | Not viable |
| FLUX.1 Dev | Q4_K_M | n/a | Not viable |
| Gemma 2 27B Instruct | Q4_K_M | n/a | Not viable |
| Gemma 2 9B Instruct | Q4_K_M | 13 tok/s | Acceptable |
| Gemma 3 12B | Q3_K_M | 5 tok/s | Marginal |
| Gemma 3 27B | Q3_K_M | 3 tok/s | Marginal |
| Gemma 3 4B | Q4_K_M | 22 tok/s | Acceptable |
| Gemma 4 26B-A4B | Q3_K_M | 51 tok/s | Good |
| Gemma 4 31B | Q3_K_M | 7 tok/s | Marginal |
| Gemma 4 E2B | Q8_0 | 22 tok/s | Acceptable |
| Gemma 4 E4B | Q8_0 | 14 tok/s | Marginal |
| GigaChat Lightning 10B | Q8_0 | 38 tok/s | Acceptable |
| InternLM 2.5 7B Chat | Q4_K_M | 15 tok/s | Acceptable |
| Llama 3.1 70B Instruct | Q2_K | n/a | Not viable |
| Llama 3.1 8B Instruct | Q4_K_M | 15 tok/s | Acceptable |
| Llama 3.2 1B Instruct | Q8_0 | 45 tok/s | Good |
| Llama 3.2 3B Instruct | Q8_0 | 35 tok/s | Good |
| Llama 3.3 70B Instruct | Q2_K | n/a | Not viable |
| LLaVA 1.6 13B | Q4_K_M | 8 tok/s | Acceptable |
| Mistral 7B Instruct v0.3 | Q4_K_M | 14 tok/s | Acceptable |
| Mistral Small 24B Instruct | Q3_K_M | n/a | Not viable |
| Mixtral 8x7B Instruct | Q2_K | n/a | Not viable |
| nomic-embed-text v1.5 | Q8_0 | 600 tok/s | Good |
| NVIDIA Nemotron-3-super-120B-A12B | Q2_K | n/a | Not viable |
| Phi-3 Medium 14B Instruct | Q3_K_M | 6 tok/s | Marginal |
| Phi-3 Mini 3.8B Instruct | Q8_0 | 32 tok/s | Good |
| Phi-4 14B | Q3_K_M | 5 tok/s | Marginal |
| Phi-4 Mini | Q8_0 | 30 tok/s | Good |
| Qwen 2.5 14B Instruct | Q3_K_M | 5 tok/s | Marginal |
| Qwen 2.5 72B Instruct | Q2_K | n/a | Not viable |
| Qwen 2.5 7B Instruct | Q4_K_M | 16 tok/s | Acceptable |
| Qwen 2.5 Coder 32B Instruct | Q3_K_M | n/a | Not viable |
| Qwen 2.5 Coder 7B Instruct | Q4_K_M | 15 tok/s | Acceptable |
| Qwen3-14B Instruct | Q8_0 | 2 tok/s | Marginal |
| Qwen3-30B-A3B | Q4_K_M | 3 tok/s | Marginal |
| Qwen3-32B Instruct | Q3_K_M | 2 tok/s | Marginal |
| Qwen3-8B Instruct | Q8_0 | 8 tok/s | Marginal |
| Qwen3.5-122B-A10B | Q3_K_M | n/a | Not viable |
| Qwen3.5-27B | Q4_K_M | 16 tok/s | Acceptable |
| Qwen3.5-397B (MoE) | Q2_K | n/a | Not viable |
| Qwen3.6-27B | Q4_K_M | 16 tok/s | Acceptable |
| Qwen3.6-35B-A3B | Q3_K_M | 3 tok/s | Marginal |
| QwQ 32B Preview | Q3_K_M | n/a | Not viable |
| Stable Diffusion 3 Medium | FP16 | n/a | Good |
| Stable Diffusion 3.5 Large | FP16 | n/a | Marginal |
| Stable Diffusion XL 1.0 | FP16 | n/a | Good |
| StarCoder 2 15B | Q3_K_M | 4 tok/s | Marginal |
| Whisper Large V3 | Q5_K_M | n/a | Good |
| Whisper Large V3 Turbo | FP16 | n/a | Good |
| Yi 1.5 34B Chat | Q3_K_M | n/a | Not viable |
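
With 59 rows, it can help to filter the table programmatically. The following is a minimal sketch, assuming the rows are available as pipe-delimited text in the same four-column layout (model, quantization, speed, rating). The names `Row`, `parse_rows`, and `usable` are hypothetical helpers, and any speed cell that is not of the form "N tok/s" is treated as missing.

```python
import re
from collections import Counter
from typing import List, NamedTuple, Optional

KNOWN_RATINGS = {"Good", "Acceptable", "Marginal", "Not viable"}


class Row(NamedTuple):
    model: str
    quant: str
    tok_per_s: Optional[float]  # None when the table lists no speed
    rating: str


# A few rows copied from the table above, used as sample input.
SAMPLE = """\
| DeepSeek Coder V2 Lite 16B | Q4_K_M | 20 tok/s | Good |
| Qwen 2.5 Coder 7B Instruct | Q4_K_M | 15 tok/s | Acceptable |
| StarCoder 2 15B | Q3_K_M | 4 tok/s | Marginal |
| Qwen 2.5 Coder 32B Instruct | Q3_K_M | n/a | Not viable |
"""


def parse_rows(text: str) -> List[Row]:
    """Parse pipe-delimited rows, skipping headers and malformed lines."""
    rows = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 4 or cells[3] not in KNOWN_RATINGS:
            continue  # header, separator, or unrelated line
        model, quant, speed, rating = cells
        match = re.fullmatch(r"([\d.]+)\s*tok/s", speed)
        rows.append(Row(model, quant, float(match.group(1)) if match else None, rating))
    return rows


def usable(rows: List[Row], min_tok_per_s: float = 10.0) -> List[Row]:
    """Keep rows rated Good or Acceptable that meet a minimum speed."""
    return [
        r for r in rows
        if r.rating in ("Good", "Acceptable")
        and r.tok_per_s is not None
        and r.tok_per_s >= min_tok_per_s
    ]


if __name__ == "__main__":
    rows = parse_rows(SAMPLE)
    print(Counter(r.rating for r in rows))
    for r in usable(rows):
        print(r.model, r.quant, r.tok_per_s)
```

The 10 tok/s threshold is an arbitrary illustration of an interactive-use cutoff; adjust it to the workload.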
Frequently Asked Questions
- What AI models can Apple M3 Pro (18GB Unified) run?
- The compatibility table covers 59 AI models for the Apple M3 Pro (18GB Unified), 40 of which are rated Marginal or better. Top performers include all-MiniLM-L6-v2, nomic-embed-text v1.5, and Gemma 4 26B-A4B. See the full compatibility table above for speeds and quality ratings.
- Is Apple M3 Pro (18GB Unified) good for AI coding?
- Yes. With 18 GB, the Apple M3 Pro (18GB Unified) handles single-model coding workflows well at the Capable tier.
- How much VRAM does Apple M3 Pro (18GB Unified) have?
- The Apple M3 Pro (18GB Unified) has 18 GB of unified memory with 150 GB/s bandwidth.
- Can Apple M3 Pro (18GB Unified) run 70B models?
- Not in practice. The compatibility table above rates Llama 3.1 70B Instruct, Llama 3.3 70B Instruct, and Qwen 2.5 72B Instruct as Not viable on the Apple M3 Pro (18GB Unified), even at Q2_K. Consider hardware with 48 GB or more of memory for full-speed 70B inference.
- Is Apple M3 Pro (18GB Unified) worth it for AI?
- At $1,799, the Apple M3 Pro (18GB Unified) offers 18 GB of unified memory and is rated Marginal or better on 40 of the 59 listed AI models. It works for smaller models and experimentation.
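
The 150 GB/s bandwidth figure also bounds generation speed. A common rule of thumb, used here as an assumption and not as the page's methodology, is that decoding a dense model reads every weight once per token, so tokens per second cannot exceed bandwidth divided by the size of the weights. The function name `decode_ceiling_tok_per_s` is hypothetical.

```python
def decode_ceiling_tok_per_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on decode speed for a dense model.

    ASSUMPTION: decoding is memory-bandwidth bound and each generated token
    requires reading all model weights once. Real throughput is lower.
    """
    if weights_gb <= 0:
        raise ValueError("weights_gb must be positive")
    return bandwidth_gb_s / weights_gb


if __name__ == "__main__":
    # An 8B model at roughly 4.85 bits per weight is about 4.85 GB of weights
    # (an approximate figure, not taken from the page).
    ceiling = decode_ceiling_tok_per_s(bandwidth_gb_s=150.0, weights_gb=4.85)
    print(f"ceiling: {ceiling:.1f} tok/s")
```

For that 8B example the ceiling works out to roughly 31 tok/s, which is consistent with the 15 tok/s the table lists for Llama 3.1 8B Instruct at Q4_K_M: measured speeds sit below the bound. Mixture-of-experts models read only their active parameters per token, so this bound does not apply to them directly.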