OwnRig

DeepSeek V3 on Apple M4 Max (128GB Unified)

Apple M4 Max (128GB Unified) can run DeepSeek V3 at an estimated 3 tok/s at Q2_K quantization, though performance is marginal. Consider a higher-end GPU setup for better results.

Model Size

671B

Device VRAM

128 GB

Bandwidth

546 GB/s

Quantizations Tested

1

Performance by Quantization

Each row shows DeepSeek V3 performance at a different quality level on Apple M4 Max (128GB Unified).

Quantization  Speed    TTFT     Fits in VRAM  Rating    Confidence
Q2_K          3 tok/s  5000 ms  ✓ Yes         Marginal  estimated
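During decode, throughput on unified-memory hardware is typically limited by memory bandwidth, so a common back-of-envelope ceiling is bandwidth divided by the bytes read per token. A minimal sketch, assuming as a worst case that the full quantized weights are streamed for every token (real systems also pay compute and overhead costs):

```python
# Rough decode-speed ceiling for a memory-bandwidth-bound setup.
# Assumption: every token reads the full quantized model (worst case;
# an MoE model can read less than this per token).

def decode_ceiling_tok_s(bandwidth_gb_s: float, model_gb: float) -> float:
    """Upper bound on tokens/second from memory bandwidth alone."""
    return bandwidth_gb_s / model_gb

# Figures from this page: 546 GB/s bandwidth, ~115 GB model at Q2_K.
ceiling = decode_ceiling_tok_s(546, 115)
print(f"{ceiling:.1f} tok/s ceiling")  # ~4.7 tok/s
```

The ~4.7 tok/s ceiling is consistent with the estimated 3 tok/s in the table once real-world overhead is accounted for.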

Notes

Q2_K

Barely fits at Q2_K (115 GB) with heavy quality loss. The 128 GB of unified memory is just enough. Extremely slow given the model's size relative to memory bandwidth. Included to show what's technically possible — not recommended for production use.
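The "barely fits" claim can be sanity-checked with simple arithmetic: quantized weights plus headroom for the OS, KV cache, and runtime buffers must stay under the 128 GB of unified memory. A minimal sketch; the 10 GB headroom figure is an assumption for illustration, not a measured value:

```python
# Does a quantized model fit in unified memory? Illustrative only;
# the default headroom for OS + KV cache + buffers is an assumed figure.

def fits_in_memory(model_gb: float, memory_gb: float,
                   headroom_gb: float = 10.0) -> bool:
    """True if weights plus headroom fit within total memory."""
    return model_gb + headroom_gb <= memory_gb

print(fits_in_memory(115, 128))  # Q2_K: 115 + 10 = 125 GB <= 128 GB -> True
print(fits_in_memory(150, 128))  # a larger quantization would not fit -> False
```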

About DeepSeek V3

DeepSeek V3 (671B) is a multi-purpose model suited to chat, coding, and reasoning. It is a massive MoE model rivaling the GPT-4 class, with only ~37B parameters active per token despite 671B total. Requires multi-GPU or very large unified memory (128GB+ Apple Silicon at Q2/Q3). Not for casual home use — included for completeness and to show what the high end looks like.
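The MoE figures quoted above imply that only a small fraction of the weights participate in any single token. A quick calculation of that ratio, using only the numbers stated on this page:

```python
# Active-parameter ratio for DeepSeek V3's MoE design (figures from the text).
total_params_b = 671   # total parameters, in billions
active_params_b = 37   # parameters active per token, in billions

ratio = active_params_b / total_params_b
print(f"{ratio:.1%} of parameters active per token")  # ~5.5%
```

This is why the model can be run at all on consumer-class hardware: per-token compute scales with the ~37B active parameters, even though all 671B must still be held in memory.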

View all DeepSeek V3 hardware options →

About Apple M4 Max (128GB Unified)

Apple M4 Max (128GB Unified) has 128 GB of unified memory with 546 GB/s of memory bandwidth. It is available in the MacBook Pro 16" and Mac Studio.

See all models Apple M4 Max (128GB Unified) can run →

Source: MLX performance estimates (2026-03-15)

Data last updated: 2026-03-15