
| Category | Component | Price | Rationale | Buy |
|---|---|---|---|---|
| gpu | NVIDIA GeForce RTX 4090 | $1,799 | 24GB VRAM handles all models up to 33B at Q4. Best single-GPU inference performance available in consumer hardware. | |
| cpu | AMD Ryzen 9 7950X | $449 | 16-core/32-thread for heavy multitasking: run models, IDE, Docker, and compilation simultaneously. Fastest AM5 chip. | |
| motherboard | | $349 | X670E with Thunderbolt 4, dual x16 slots for a future dual-GPU setup, and 10GbE for data transfer. | |
| ram | 64GB DDR5-6000 (2x32GB) | $189 | 64GB at the DDR5 sweet spot for Zen 4. Room for 128GB upgrade. Fast enough for CPU offloading when models exceed VRAM. | |
| storage | | $379 | Primary NVMe for active models plus a secondary drive for archive. 6TB total holds 100+ quantized models. | |
| psu | | $219 | 1000W covers a 4090 with full-system headroom. Platinum efficiency reduces heat and power cost during long inference sessions. | |
| case | Fractal Design Torrent | $179 | Full tower with class-leading airflow. Fits a 4090 with room for dual GPUs later. Two 180mm front fans move massive air volume. | |
| cooler | Noctua NH-D15 chromax.black | $109 | Top air cooler for the 7950X. Handles 170W TDP quietly. No pump failure risk vs an AIO. | |
| Total | | $3,672 | | |
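A quick way to sanity-check the "24GB VRAM handles up to 33B at Q4" claim: a quantized model's footprint is roughly parameter count times effective bits per weight, plus runtime overhead for KV cache and buffers. This is a back-of-envelope sketch; the ~4.85 bits/weight figure for Q4_K_M and the 1.2x overhead factor are approximations, not measured values.

```python
def model_vram_gb(params_b: float, bits_per_weight: float, overhead: float = 1.2) -> float:
    """Estimate memory needed to run a quantized model.

    params_b        -- parameter count in billions
    bits_per_weight -- effective bits per weight for the quant
                       (roughly 4.85 for Q4_K_M, 8.5 for Q8_0; approximate)
    overhead        -- multiplier for KV cache, activations, and buffers (assumed)
    """
    weight_bytes = params_b * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9  # decimal GB

# 33B at Q4_K_M lands right around the RTX 4090's 24GB
print(f"33B @ Q4_K_M: {model_vram_gb(33, 4.85):.1f} GB")  # about 24 GB
# 8B at Q8_0 fits with plenty of headroom
print(f"8B  @ Q8_0:   {model_vram_gb(8, 8.5):.1f} GB")
```

Anything past that point spills into system RAM via CPU offloading, which is where the 64GB of DDR5 earns its keep.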
Prices and availability vary by retailer; inspect hardware before purchasing.

The following models were tested on this build's hardware:
| Model | Quant | Speed |
|---|---|---|
| Qwen 2.5 Coder 32B Instruct | Q4_K_M | 25 tok/s |
| QwQ 32B Preview | Q4_K_M | 24 tok/s |
| Llama 3.1 8B Instruct | Q8_0 | 95 tok/s |
| Code Llama 34B Instruct | Q4_K_M | 22 tok/s |
| Gemma 2 27B Instruct | Q4_K_M | 22 tok/s |
| Mixtral 8x7B Instruct | Q3_K_M | 35 tok/s |
| FLUX.1 Dev | FP16 | — (image model) |
| LLaVA 1.6 13B | Q5_K_M | 30 tok/s |
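The tok/s figures translate directly into wait time for a response. A trivial helper (illustrative only, not part of any benchmark suite) makes the comparison concrete, ignoring prompt-processing time:

```python
def response_seconds(tokens: int, toks_per_sec: float) -> float:
    """Wall-clock time to generate a response, ignoring prompt processing."""
    return tokens / toks_per_sec

# A 500-token answer from Qwen 2.5 Coder 32B at 25 tok/s:
print(f"{response_seconds(500, 25):.0f} s")   # 20 s
# The same answer from Llama 3.1 8B at 95 tok/s:
print(f"{response_seconds(500, 95):.1f} s")
```

In practice the 32B models feel like a slow typist, while the 8B model is near-instant; both are usable for interactive coding.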
Upgrade path: add a used RTX 3090 (~$900) in the X670E's second x16 slot for 48GB of total VRAM, or move to dual RTX 4090s with a motherboard swap. The 7950X and 64GB of RAM handle a dual-GPU setup without bottlenecking.
Last updated: 2026-03-01.