HP Z8 Fury G6i (2× RTX PRO 6000 Max-Q, 192 GB)
Windows · Linux
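The HP Z8 Fury G6i pairs two NVIDIA RTX PRO 6000 Blackwell Max-Q GPUs (2× 96 GB, 192 GB GDDR7 total) with an Intel Xeon 600-series CPU and 512 GB of DDR5-6400. The combined VRAM holds 70B models at full FP16 quality (about 140 GB of weights). DeepSeek R1 (671B) does not fit in 192 GB of VRAM even at Q2_K and is rated Not viable in the table below; running it at Q4_K_M would mean spilling a large share of the weights into system RAM. A mid-range professional AI workstation.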
From $35,000 (estimated). Enterprise pricing varies by configuration and region. Confirm quote and availability with HP.
Memory (VRAM): 192 GB (2× 96 GB GDDR7)
GPUs: 2×
RAM: 512 GB
CPU: Intel Xeon w7-3455 (24-core Granite Rapids)
OS: Windows, Linux
Models: 63
Type: Desktop
Multi-GPU System
This system has 2 GPUs with 96 GB each (192 GB total). Models that fit on a single GPU run at full speed. Larger models must be split across both GPUs, and actual throughput then depends on the inference engine and interconnect bandwidth.
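The single-GPU versus cross-GPU rule above can be approximated with a little arithmetic. The following is a rough sketch, not the method used to produce the ratings on this page: the bits-per-weight values for the K-quants are approximate averages (real GGUF file sizes vary by model), the `headroom` factor reserving 10% of VRAM for KV cache and activations is an assumption, and the function names are hypothetical.

```python
# Approximate bits per weight for common quantization formats.
# Assumption: K-quant figures are rough averages; actual file sizes vary by model.
BITS_PER_WEIGHT = {
    "FP16": 16.0,
    "Q8_0": 8.5,
    "Q5_K_M": 5.7,
    "Q4_K_M": 4.85,
    "Q2_K": 2.6,
}

GPU_COUNT = 2
VRAM_PER_GPU_GB = 96.0  # 2x 96 GB GDDR7, as listed in the specs


def weights_gb(params_billion: float, quant: str) -> float:
    """Estimate weight size in GB (1 GB = 1e9 bytes).

    Covers weights only; KV cache and activations are not included.
    """
    return params_billion * BITS_PER_WEIGHT[quant] / 8.0


def placement(params_billion: float, quant: str, headroom: float = 0.9) -> str:
    """Classify where the weights would have to live on this system.

    headroom is an assumed fraction of VRAM usable for weights.
    """
    size = weights_gb(params_billion, quant)
    if size <= VRAM_PER_GPU_GB * headroom:
        return "single GPU"
    if size <= GPU_COUNT * VRAM_PER_GPU_GB * headroom:
        return "split across GPUs"
    return "exceeds VRAM"


if __name__ == "__main__":
    for name, params, quant in [
        ("70B dense", 70, "FP16"),
        ("70B dense", 70, "Q8_0"),
        ("DeepSeek R1", 671, "Q4_K_M"),
        ("DeepSeek R1", 671, "Q2_K"),
    ]:
        size = weights_gb(params, quant)
        print(f"{name} @ {quant}: ~{size:.0f} GB -> {placement(params, quant)}")
```

Under these assumptions, a 70B model at FP16 needs both GPUs, the same model at Q8_0 fits on one, and a 671B model exceeds the combined VRAM at either Q4_K_M or Q2_K. Long contexts increase the KV cache well beyond what the fixed headroom covers, so treat the result as a first filter only.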
What it can run
63 models
| Model | Quantization | Speed | Rating |
| --- | --- | --- | --- |
| all-MiniLM-L6-v2 | FP16 | 2760 tok/s | Excellent |
| Arcee Trinity Mini 26B | Q8_0 | 75 tok/s | Excellent |
| Arcee Trinity Nano 6B | Q8_0 | 318 tok/s | Excellent |
| Code Llama 34B Instruct | Q5_K_M | 44 tok/s | Good |
| Codestral 22B | Q5_K_M | 67 tok/s | Good |
| Command R 35B | Q8_0 | 26 tok/s | Acceptable |
| DeepSeek Coder V2 Lite 16B | Q8_0 | 58 tok/s | Good |
| DeepSeek R1 | Q2_K | – | Not viable |
| DeepSeek R1 Distill Qwen 32B | Q8_0 | 28 tok/s | Acceptable |
| DeepSeek R1 Distill Qwen 7B | Q8_0 | 119 tok/s | Excellent |
| DeepSeek V3 | Q2_K | – | Not viable |
| FLUX.1 Dev | FP16 | – | Excellent |
| Gemma 2 27B Instruct | Q5_K_M | 54 tok/s | Good |
| Gemma 2 9B Instruct | Q8_0 | 98 tok/s | Excellent |
| Gemma 3 12B | Q8_0 | 74 tok/s | Good |
| Gemma 3 27B | Q8_0 | 33 tok/s | Good |
| Gemma 3 4B | Q8_0 | 210 tok/s | Excellent |
| Gemma 4 26B-A4B | Q8_0 | 279 tok/s | Excellent |
| Gemma 4 31B | Q8_0 | 41 tok/s | Good |
| Gemma 4 E2B | Q8_0 | 271 tok/s | Excellent |
| Gemma 4 E4B | Q8_0 | 168 tok/s | Excellent |
| GigaChat Lightning 10B | Q8_0 | 299 tok/s | Excellent |
| InternLM 2.5 7B Chat | Q8_0 | 117 tok/s | Excellent |
| Llama 3.1 70B Instruct | Q5_K_M | 21 tok/s | Acceptable |
| Llama 3.1 8B Instruct | Q8_0 | 112 tok/s | Excellent |
| Llama 3.2 11B Vision | Q8_0 | 82 tok/s | Excellent |
| Llama 3.2 1B Instruct | Q8_0 | 433 tok/s | Excellent |
| Llama 3.2 3B Instruct | Q8_0 | 239 tok/s | Excellent |
| Llama 3.3 70B Instruct | Q8_0 | 13 tok/s | Acceptable |
| Llama 4 Scout | Q5_K_M | 87 tok/s | Excellent |
| LLaVA 1.6 13B | Q5_K_M | 114 tok/s | Excellent |
| Mistral 7B Instruct v0.3 | Q8_0 | 125 tok/s | Excellent |
| Mistral Large 2 123B | Q5_K_M | 12 tok/s | Acceptable |
| Mistral Small 24B Instruct | Q8_0 | 38 tok/s | Good |
| Mixtral 8x7B Instruct | Q5_K_M | 115 tok/s | Excellent |
| nomic-embed-text v1.5 | FP16 | 1840 tok/s | Excellent |
| NVIDIA Nemotron-3-super-120B-A12B | Q4_K_M | 145 tok/s | Excellent |
| Phi-3 Medium 14B Instruct | Q8_0 | 64 tok/s | Good |
| Phi-3 Mini 3.8B Instruct | Q8_0 | 236 tok/s | Excellent |
| Phi-4 14B | Q8_0 | 62 tok/s | Good |
| Phi-4 Mini | Q8_0 | 236 tok/s | Excellent |
| Qwen 2.5 14B Instruct | Q8_0 | 61 tok/s | Good |
| Qwen 2.5 72B Instruct | Q4_K_M | 24 tok/s | Acceptable |
| Qwen 2.5 7B Instruct | Q8_0 | 119 tok/s | Excellent |
| Qwen 2.5 Coder 32B Instruct | Q5_K_M | 46 tok/s | Good |
| Qwen 2.5 Coder 7B Instruct | Q8_0 | 119 tok/s | Excellent |
| Qwen3-14B Instruct | Q8_0 | 64 tok/s | Good |
| Qwen3-30B-A3B | Q8_0 | 256 tok/s | Excellent |
| Qwen3-32B Instruct | Q8_0 | 29 tok/s | Acceptable |
| Qwen3-8B Instruct | Q8_0 | 110 tok/s | Excellent |
| Qwen3.5-122B-A10B | Q8_0 | 90 tok/s | Excellent |
| Qwen3.5-27B | Q8_0 | 33 tok/s | Good |
| Qwen3.5-397B (MoE) | Q2_K | – | Not viable |
| Qwen3.6-27B | Q8_0 | 33 tok/s | Good |
| Qwen3.6-35B-A3B | Q5_K_M | 256 tok/s | Excellent |
| QwQ 32B Preview | Q5_K_M | 46 tok/s | Good |
| Stable Diffusion 3 Medium | FP16 | – | Excellent |
| Stable Diffusion 3.5 Large | FP16 | – | Excellent |
| Stable Diffusion XL 1.0 | FP16 | – | Excellent |
| StarCoder 2 15B | Q8_0 | 58 tok/s | Good |
| Whisper Large V3 | FP16 | – | Excellent |
| Whisper Large V3 Turbo | FP16 | – | Excellent |
| Yi 1.5 34B Chat | Q8_0 | 27 tok/s | Acceptable |
Who this machine makes sense for
This machine is aimed at team, lab, or enterprise buyers who want a supported system instead of assembling a tower. With 192 GB of GPU memory, it handles serious local workloads without a DIY build process.
What to verify first
The main question is not whether the machine works, but whether the price premium is justified by warranty, support, and deployment simplicity versus an equivalent custom build.