OwnRig

DeepSeek R1 Distill Qwen 32B on NVIDIA GeForce RTX 4070 Ti Super

NVIDIA GeForce RTX 4070 Ti Super can run DeepSeek R1 Distill Qwen 32B at an estimated 15 tok/s at Q3_K_M, which is rated acceptable. Consider a GPU with more VRAM to run higher-quality quantizations.

Model Size

32.5B

Device VRAM

16 GB

Bandwidth

672 GB/s

Quantizations Tested

1

Performance by Quantization

Each row shows DeepSeek R1 Distill Qwen 32B performance at a different quality level on NVIDIA GeForce RTX 4070 Ti Super.

Quantization | Speed | TTFT | Fits in VRAM | Rating | Confidence
Q3_K_M | 15 tok/s | 580 ms | ✓ Yes | Acceptable | estimated

Notes

Q3_K_M

Q3 quantization is required to fit the model in 16 GB of VRAM. It is usable for reasoning, with some compromise in output quality.
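
As a rough sanity check on why a Q3 quantization is needed, the weight footprint can be estimated from the parameter count and an average bits-per-weight figure. The sketch below is only an approximation: the bits-per-weight values are assumed ballpark figures for GGUF-style quantizations, not measurements from this page, and the estimate ignores the KV cache and runtime overhead, which also consume VRAM.

```python
def weight_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (10^9 bytes)."""
    return params_billions * bits_per_weight / 8


# Assumed average bits per weight; real files vary by model and tooling.
ASSUMED_BPW = {"Q3_K_M": 3.9, "Q4_K_M": 4.85}

PARAMS_B = 32.5  # model size listed on this page
VRAM_GB = 16.0   # device VRAM listed on this page

for name, bpw in ASSUMED_BPW.items():
    size = weight_size_gb(PARAMS_B, bpw)
    verdict = "may fit" if size < VRAM_GB else "does not fit"
    print(f"{name}: ~{size:.1f} GB of weights, {verdict} in {VRAM_GB:.0f} GB")
```

Under these assumptions the Q3_K_M weights land just under 16 GB while a Q4-class quantization exceeds it, which is consistent with the note above. The margin is thin, so context length and other VRAM users matter in practice.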

About DeepSeek R1 Distill Qwen 32B

DeepSeek R1 Distill Qwen 32B (32.5B parameters) is a distilled reasoning model with strong coding and chat capabilities.

About NVIDIA GeForce RTX 4070 Ti Super

NVIDIA GeForce RTX 4070 Ti Super has 16 GB of VRAM and 672 GB/s of memory bandwidth. Street price: $779.
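
Memory bandwidth matters because token generation is often limited by how fast the weights can be read from VRAM. A common rule of thumb, sketched below, treats bandwidth divided by weight size as an optimistic ceiling on decode speed. This is an assumption-laden estimate, not how the figures on this page were produced; the 15.8 GB weight size is a hypothetical value for a Q3-class quantization of a 32.5B model.

```python
def decode_ceiling_tok_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Optimistic upper bound on tokens/s if every token reads all weights once."""
    return bandwidth_gb_s / weights_gb


BANDWIDTH_GB_S = 672.0     # memory bandwidth listed on this page
ASSUMED_WEIGHTS_GB = 15.8  # hypothetical Q3-class weight size, not from this page

ceiling = decode_ceiling_tok_s(BANDWIDTH_GB_S, ASSUMED_WEIGHTS_GB)
print(f"Bandwidth-bound ceiling: ~{ceiling:.0f} tok/s")
```

Real throughput typically falls well below this ceiling because of compute cost, KV cache reads, and reduced headroom when VRAM is nearly full.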

Source: Community benchmarks and estimated performance (2026-03-01)

Data last updated: 2026-03-15