Mini
Apple Mac Studio (M4 Max, 128GB)
macOS
M4 Max with 128GB unified memory (1TB SSD baseline).
Memory: 128 GB
GPUs: 1×
RAM: 128 GB
Models: 33
Type: Mini
Inference Memory: 128 GB
Accelerator: 128 GB
System RAM: 128 GB
OS: macOS
What it can run
33 models

| Model | Quantization | Speed | Rating |
| --- | --- | --- | --- |
| Arcee Trinity Large Thinking 400B | Q3_K_M | 1 tok/s | Not viable |
| Arcee Trinity Mini 26B | Q8_0 | 28 tok/s | Good |
| Arcee Trinity Nano 6B | Q8_0 | 118 tok/s | Excellent |
| DeepSeek R1 | Q2_K | 4 tok/s | Marginal |
| DeepSeek R1 Distill Qwen 32B | Q5_K_M | 16 tok/s | Good |
| DeepSeek V3 | Q2_K | 3 tok/s | Marginal |
| Gemma 3 27B | Q8_0 | 12 tok/s | Good |
| Gemma 4 26B-A4B | Q8_0 | 84 tok/s | Excellent |
| Gemma 4 31B | Q8_0 | 12 tok/s | Marginal |
| Gemma 4 E2B | Q8_0 | 82 tok/s | Excellent |
| Gemma 4 E4B | Q8_0 | 50 tok/s | Good |
| GigaChat Lightning 10B | Q8_0 | 72 tok/s | Excellent |
| Llama 3.1 70B Instruct | Q5_K_M | 7 tok/s | Acceptable |
| Llama 3.2 11B Vision | Q8_0 | 42 tok/s | Excellent |
| Llama 3.2 1B Instruct | Q8_0 | 150 tok/s | Excellent |
| Llama 3.2 3B Instruct | Q8_0 | 100 tok/s | Excellent |
| Llama 3.3 70B Instruct | Q4_K_M | 18 tok/s | Acceptable |
| Llama 4 Scout | Q8_0 | 4 tok/s | Marginal |
| Mistral Large 2 123B | Q4_K_M | 10 tok/s | Acceptable |
| NVIDIA Nemotron-3-super-120B-A12B | Q4_K_M | 39 tok/s | Excellent |
| Phi-4 Mini | Q8_0 | 90 tok/s | Excellent |
| Qwen 2.5 72B Instruct | Q4_K_M | 6 tok/s | Acceptable |
| Qwen 2.5 Coder 32B Instruct | Q8_0 | 15 tok/s | Good |
| Qwen3-30B-A3B | Q8_0 | 17 tok/s | Acceptable |
| Qwen3-32B Instruct | Q8_0 | 14 tok/s | Acceptable |
| Qwen3.5-122B-A10B | Q8_0 | 36 tok/s | Excellent |
| Qwen3.5-27B | Q8_0 | 16 tok/s | Excellent |
| Qwen3.5-397B (MoE) | Q2_K | 8 tok/s | Marginal |
| Qwen3.6-27B | Q8_0 | 16 tok/s | Excellent |
| Qwen3.6-35B-A3B | Q5_K_M | 17 tok/s | Acceptable |
| QwQ 32B Preview | Q8_0 | 14 tok/s | Good |
| Stable Diffusion 3.5 Large | FP16 | n/a | Good |
| Whisper Large V3 Turbo | FP16 | n/a | Excellent |
Best Fit
Who this machine makes sense for
This machine is a ready-to-buy option for users who want predictable local AI performance without building from parts. Its 128 GB of unified memory leaves enough headroom to run models in the 70B to 120B range at usable quantizations, not just toy workloads.
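As a rough illustration of what that headroom means, the sketch below estimates whether a model's weights fit in memory at a given quantization. The bits-per-weight figures are approximate assumed values for GGUF-style quantization, and `usable_fraction` and `overhead_gb` are hypothetical placeholders for the share of memory available to the accelerator and for the KV cache and runtime buffers. None of these numbers come from this page, so treat the output as a ballpark only.

```python
# Approximate bits per weight for common quantization types.
# ASSUMPTION: rough illustrative values; real model files vary by architecture.
BITS_PER_WEIGHT = {
    "FP16": 16.0,
    "Q8_0": 8.5,
    "Q5_K_M": 5.7,
    "Q4_K_M": 4.8,
    "Q3_K_M": 3.9,
    "Q2_K": 3.0,
}


def weight_size_gb(params_billion: float, quant: str) -> float:
    """Estimated size of the model weights in GB (10^9 bytes)."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8.0


def fits(
    params_billion: float,
    quant: str,
    memory_gb: float = 128.0,
    usable_fraction: float = 0.75,  # ASSUMPTION: share of memory usable for inference
    overhead_gb: float = 8.0,       # ASSUMPTION: KV cache and runtime buffers
) -> bool:
    """True if weights plus the assumed overhead fit in the assumed budget."""
    budget_gb = memory_gb * usable_fraction
    return weight_size_gb(params_billion, quant) + overhead_gb <= budget_gb


if __name__ == "__main__":
    # Parameter counts are taken from the model names in the table above.
    for name, params, quant in [
        ("Llama 3.3 70B Instruct", 70, "Q4_K_M"),
        ("Mistral Large 2 123B", 123, "Q4_K_M"),
        ("Arcee Trinity Large Thinking 400B", 400, "Q3_K_M"),
    ]:
        size = weight_size_gb(params, quant)
        print(f"{name} @ {quant}: ~{size:.0f} GB, fits={fits(params, quant)}")
```

Under these assumptions a 70B model at Q4_K_M needs roughly 42 GB for weights and fits comfortably, while a 400B model at Q3_K_M needs well over 128 GB, which is consistent with its "Not viable" rating in the table.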
Before You Buy
What to verify first
The main check before buying is the upgrade path: confirm the memory ceiling, whether storage can be expanded, and whether the accelerator will still suit the models you expect to run a year from now.