OwnRig

DeepSeek R1 Distill Qwen 32B on NVIDIA GeForce RTX 5090

The NVIDIA GeForce RTX 5090 runs DeepSeek R1 Distill Qwen 32B at an estimated 42 tok/s at Q5_K_M, with the model fitting entirely in VRAM. The pairing is rated Excellent.

Model Size

32.5B

Device VRAM

32 GB

Bandwidth

1792 GB/s

Quantizations Tested

1

Performance by Quantization

Each row shows DeepSeek R1 Distill Qwen 32B performance at one quantization level on NVIDIA GeForce RTX 5090. Only Q5_K_M has been tested so far.

Quantization | Speed    | TTFT (time to first token) | Fits in VRAM | Rating    | Confidence
Q5_K_M       | 42 tok/s | 220 ms                     | ✓ Yes        | Excellent | estimated

Notes

Q5_K_M

At Q5_K_M, the 32B reasoning model fits in 32 GB of VRAM. This configuration is noted as offering the best local reasoning performance.
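As a rough sanity check on the "fits in VRAM" claim, the weight footprint can be estimated from the parameter count and the average bits per weight. The sketch below assumes about 5.5 bits per weight for Q5_K_M; that figure is an approximation and not a value stated on this page. It also ignores the KV cache, activations, runtime overhead, and the GB versus GiB distinction, so treat the result as a ballpark only.

```python
def estimate_weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


# Assumption: Q5_K_M averages roughly 5.5 bits per weight (not from this page).
ASSUMED_BPW_Q5_K_M = 5.5

MODEL_PARAMS_BILLION = 32.5  # model size listed on this page
DEVICE_VRAM_GB = 32          # device VRAM listed on this page

weights_gb = estimate_weight_gb(MODEL_PARAMS_BILLION, ASSUMED_BPW_Q5_K_M)
headroom_gb = DEVICE_VRAM_GB - weights_gb

print(f"Estimated weights: {weights_gb:.1f} GB")
print(f"Headroom for KV cache and overhead: {headroom_gb:.1f} GB")
```

Under these assumptions the weights take roughly 22 GB, which leaves several gigabytes for context. Longer context windows consume that headroom quickly, so the actual fit depends on the settings used.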

About DeepSeek R1 Distill Qwen 32B

DeepSeek R1 Distill Qwen 32B (32.5B parameters) is a distilled reasoning model with strong coding and chat capabilities.

About NVIDIA GeForce RTX 5090

NVIDIA GeForce RTX 5090 has 32 GB of VRAM and 1792 GB/s of memory bandwidth. Street price: $2,199.
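Memory bandwidth matters because single-stream token generation is usually limited by how fast the weights can be read from VRAM. A common rule of thumb puts the decode ceiling at bandwidth divided by the size of the weights. The sketch below applies that rule using an assumed weight size of about 22.3 GB for Q5_K_M (an estimate, not a figure from this page) and compares the result with the 42 tok/s listed above.

```python
def decode_ceiling_tok_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Rule-of-thumb upper bound: each generated token reads all weights once."""
    return bandwidth_gb_s / weights_gb


# Assumption: Q5_K_M weights for a 32.5B model occupy roughly 22.3 GB.
ASSUMED_WEIGHTS_GB = 22.3

BANDWIDTH_GB_S = 1792   # memory bandwidth listed on this page
REPORTED_TOK_S = 42     # estimated speed listed on this page

ceiling = decode_ceiling_tok_s(BANDWIDTH_GB_S, ASSUMED_WEIGHTS_GB)
fraction = REPORTED_TOK_S / ceiling

print(f"Bandwidth-bound ceiling: {ceiling:.0f} tok/s")
print(f"Listed speed as a share of the ceiling: {fraction:.0%}")
```

Real throughput falls below this ceiling because of KV cache reads, kernel overhead, and sampling, so a listed speed at around half the theoretical bound is plausible for an estimate.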

Source: Community benchmarks and estimated performance (2026-03-01)

Data last updated: 2026-03-15