OwnRig

Qwen 2.5 Coder 32B Instruct on NVIDIA GeForce RTX 4060 8GB

The NVIDIA GeForce RTX 4060 8GB cannot run Qwen 2.5 Coder 32B Instruct at any quantization level: its 8 GB of VRAM is too small to hold the model's weights.
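The arithmetic behind this verdict can be sketched as a rough back-of-the-envelope check. The function names, the 1 GB runtime overhead, and the figure of about 3 bits per weight for Q2_K are illustrative assumptions, not values taken from this page; only the 32.5B parameter count and the 8 GB of VRAM come from the listing.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_vram(params_billion: float, bits_per_weight: float,
                 vram_gb: float, overhead_gb: float = 1.0) -> bool:
    """True if weights plus an assumed runtime overhead fit in VRAM.

    overhead_gb is a hypothetical allowance for KV cache and buffers.
    """
    return weight_footprint_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb


PARAMS_B = 32.5          # from the page
VRAM_GB = 8.0            # from the page
ASSUMED_Q2_K_BPW = 3.0   # assumption: rough effective bits per weight for Q2_K

needed = weight_footprint_gb(PARAMS_B, ASSUMED_Q2_K_BPW)
print(f"Estimated Q2_K weights: {needed:.1f} GB vs {VRAM_GB:.0f} GB VRAM")
print("Fits:", fits_in_vram(PARAMS_B, ASSUMED_Q2_K_BPW, VRAM_GB))

# Even an idealized 2 bits per weight with zero overhead does not fit.
print("Fits at 2.0 bpw, no overhead:",
      fits_in_vram(PARAMS_B, 2.0, VRAM_GB, overhead_gb=0.0))
```

Under these assumptions the weights alone need roughly 12 GB, and even at an exact 2 bits per weight they would slightly exceed 8 GB before any cache or buffer is allocated.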

Model Size: 32.5B
Device VRAM: 8 GB
Bandwidth: 272 GB/s
Quantizations Tested: 1

Performance by Quantization

Each row shows Qwen 2.5 Coder 32B Instruct performance at one quantization level on NVIDIA GeForce RTX 4060 8GB.

Quantization | Speed | TTFT | Fits in VRAM | Rating     | Confidence
Q2_K         | -     | -    | ✗ Offload    | Not Viable | estimated

Notes

Q2_K: 32B coding model. Not viable on 8 GB.

About Qwen 2.5 Coder 32B Instruct

Qwen 2.5 Coder 32B Instruct (32.5B parameters) is a coding model aimed at AI coding and AI building. It is the coding model that defines the builder workflow, and it matches GPT-4 on HumanEval. This is what Cursor and Continue.dev users run locally when they want to eliminate API dependency. It is released under the Apache 2.0 license and is the cornerstone of the 'Full AI Builder' profile.


About NVIDIA GeForce RTX 4060 8GB

NVIDIA GeForce RTX 4060 8GB has 8 GB of VRAM and 272 GB/s of memory bandwidth. Street price: $289.
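To show why the bandwidth figure matters for models that do fit, here is a minimal sketch of a common rule of thumb: generating each token requires reading roughly all of the weights once, so memory bandwidth divided by weight size gives an upper bound on decode speed. The function name and the 4 GB example model are hypothetical, and real throughput will be lower than this ceiling.

```python
def decode_ceiling_tokens_per_s(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Rough upper bound on tokens/s when decoding is memory-bandwidth bound.

    Assumes every generated token reads all weights from VRAM once.
    """
    if weights_gb <= 0:
        raise ValueError("weights_gb must be positive")
    return bandwidth_gb_s / weights_gb


BANDWIDTH_GB_S = 272.0      # from the page
EXAMPLE_WEIGHTS_GB = 4.0    # hypothetical smaller model that fits in 8 GB

ceiling = decode_ceiling_tokens_per_s(BANDWIDTH_GB_S, EXAMPLE_WEIGHTS_GB)
print(f"Decode ceiling for a {EXAMPLE_WEIGHTS_GB:.0f} GB model: {ceiling:.0f} tokens/s")
```

This estimate applies only when the whole model sits in VRAM; once layers are offloaded to system memory, the slower link to that memory dominates instead.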


Source: 32.76B model exceeds 8GB (2026-03-15)

Data last updated: 2026-03-01