NVIDIA
Desktop GPU

NVIDIA GeForce RTX 4060 8GB

8 GB GDDR6 · 272 GB/s

From

$289

Estimated street price

VRAM

8 GB

Bandwidth

272 GB/s

TDP

115W

Models

52

Tier

Limited

The NVIDIA GeForce RTX 4060 8GB, with 8 GB of GDDR6 VRAM, is rated against 52 AI models across the embedding, AI building, and coding categories. Best performance: all-MiniLM-L6-v2 at 8500 tok/s (excellent). Current price: approximately $289.

Source: OwnRig methodology

VRAM

8 GB

Bandwidth

272 GB/s

Memory Type

GDDR6

TDP

115W

Form Factor

2-slot, 240mm

Builder Capability: Limited

Insufficient VRAM for most AI coding workflows.
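
To make the VRAM limit concrete, the following minimal sketch estimates whether a model's quantized weights fit in 8 GB. It is a back-of-the-envelope check, not OwnRig's methodology: the function names, the bits-per-weight figures, and the 1.5 GB allowance for KV cache and runtime overhead are all illustrative assumptions.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_vram(params_billion: float, bits_per_weight: float,
                 vram_gb: float = 8.0, overhead_gb: float = 1.5) -> bool:
    """True if weights plus an assumed overhead fit in VRAM.

    overhead_gb is a hypothetical allowance for KV cache, activations,
    and the runtime itself; real usage depends on context length.
    """
    return weights_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb


if __name__ == "__main__":
    # Bits-per-weight values are rough assumptions, not measured figures.
    for name, params, bits in [("8B at ~4.5 bits", 8, 4.5),
                               ("70B at ~2.5 bits", 70, 2.5)]:
        size = weights_gb(params, bits)
        print(f"{name}: {size:.1f} GB weights, fits: {fits_in_vram(params, bits)}")
```

Under these assumptions an 8B model leaves some headroom, while a 70B model needs more than twice the card's VRAM even at aggressive quantization.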

Software

Inference Backends

The software stacks that matter most for real-world inference on this device.

CUDA

production

Primary high-performance backend for NVIDIA inference workloads.

Vulkan

stable

Fallback backend for llama.cpp and related local runtimes.

What it can run

52 models
Model | Quantization | Speed | Rating
all-MiniLM-L6-v2 | FP16 | 8500 tok/s | Excellent
Arcee Trinity Nano 6B | Q8_0 | 48 tok/s | Excellent
Code Llama 34B Instruct | Q2_K | – | Not viable
Codestral 22B | Q3_K_M | – | Not viable
Command R 35B | Q2_K | – | Not viable
DeepSeek Coder V2 Lite 16B | Q3_K_M | 45 tok/s | Good
DeepSeek R1 Distill Qwen 32B | Q2_K | – | Not viable
DeepSeek R1 Distill Qwen 7B | Q4_K_M | 32 tok/s | Good
DeepSeek V3 | Q2_K | – | Not viable
FLUX.1 Dev | Q4_K_M | – | Marginal
Gemma 2 27B Instruct | Q3_K_M | – | Not viable
Gemma 2 9B Instruct | Q4_K_M | 28 tok/s | Good
Gemma 3 12B | Q3_K_M | 18 tok/s | Marginal
Gemma 3 27B | Q3_K_M | – | Not viable
Gemma 3 4B | Q5_K_M | 55 tok/s | Excellent
Gemma 4 E2B | Q8_0 | 41 tok/s | Good
Gemma 4 E4B | Q6_K | 32 tok/s | Good
GigaChat Lightning 10B | Q4_K_M | 64 tok/s | Acceptable
InternLM 2.5 7B Chat | Q4_K_M | 30 tok/s | Good
Llama 3.1 70B Instruct | Q2_K | – | Not viable
Llama 3.1 8B Instruct | Q4_K_M | 32 tok/s | Good
Llama 3.2 1B Instruct | Q8_0 | 95 tok/s | Excellent
Llama 3.2 3B Instruct | Q8_0 | 65 tok/s | Excellent
Llama 3.3 70B Instruct | Q2_K | – | Not viable
LLaVA 1.6 13B | Q3_K_M | 22 tok/s | Marginal
Mistral 7B Instruct v0.3 | Q4_K_M | 31 tok/s | Good
Mistral Small 24B Instruct | Q3_K_M | – | Not viable
Mixtral 8x7B Instruct | Q4_K_M | – | Not viable
nomic-embed-text v1.5 | Q8_0 | 4200 tok/s | Excellent
NVIDIA Nemotron-3-super-120B-A12B | Q2_K | – | Not viable
Phi-3 Medium 14B Instruct | Q3_K_M | 20 tok/s | Marginal
Phi-3 Mini 3.8B Instruct | Q5_K_M | 52 tok/s | Excellent
Phi-4 14B | Q3_K_M | 19 tok/s | Marginal
Phi-4 Mini | Q5_K_M | 55 tok/s | Excellent
Qwen 2.5 14B Instruct | Q3_K_M | 17 tok/s | Marginal
Qwen 2.5 72B Instruct | Q2_K | – | Not viable
Qwen 2.5 7B Instruct | Q4_K_M | 30 tok/s | Good
Qwen 2.5 Coder 32B Instruct | Q2_K | – | Not viable
Qwen 2.5 Coder 7B Instruct | Q4_K_M | 31 tok/s | Good
Qwen3-14B Instruct | Q3_K_M | 18 tok/s | Acceptable
Qwen3-8B Instruct | Q5_K_M | 24 tok/s | Acceptable
Qwen3.5-27B | Q3_K_M | – | Not viable
Qwen3.5-397B (MoE) | Q2_K | – | Not viable
Qwen3.6-27B | Q3_K_M | – | Not viable
QwQ 32B Preview | Q2_K | – | Not viable
Stable Diffusion 3 Medium | FP16 | – | Good
Stable Diffusion 3.5 Large | Q8_0 | – | Not viable
Stable Diffusion XL 1.0 | FP16 | – | Good
StarCoder 2 15B | Q3_K_M | 16 tok/s | Marginal
Whisper Large V3 | Q5_K_M | – | Excellent
Whisper Large V3 Turbo | FP16 | – | Excellent
Yi 1.5 34B Chat | Q2_K | – | Not viable
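
The token rates above can be sanity-checked against the card's 272 GB/s memory bandwidth. A common rule of thumb for single-stream decoding of a dense model is that every generated token reads all the weights once, so bandwidth divided by model size gives an upper bound on tok/s. The sketch below applies that rule; the function name and the 4.9 GB size assumed for an 8B model at Q4_K_M are illustrative assumptions, and this is not OwnRig's methodology.

```python
def decode_ceiling_tok_s(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on decode speed when memory bandwidth is the bottleneck."""
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return bandwidth_gb_s / model_size_gb


if __name__ == "__main__":
    RTX_4060_BANDWIDTH_GB_S = 272.0  # from the spec list on this page
    ASSUMED_8B_Q4_SIZE_GB = 4.9      # assumption: approximate weight size
    ceiling = decode_ceiling_tok_s(RTX_4060_BANDWIDTH_GB_S, ASSUMED_8B_Q4_SIZE_GB)
    print(f"Bandwidth-bound ceiling: {ceiling:.1f} tok/s")
```

Under that assumption the ceiling is roughly 55 tok/s, so the 32 tok/s listed for Llama 3.1 8B Instruct sits plausibly below it. The rule does not apply to embedding models, which process many tokens in parallel.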



Frequently Asked Questions

What AI models can the NVIDIA GeForce RTX 4060 8GB run?
The compatibility table above lists 52 AI models for the NVIDIA GeForce RTX 4060 8GB, of which 20 are rated not viable. Top performers include all-MiniLM-L6-v2, nomic-embed-text v1.5, and Llama 3.2 1B Instruct. See the table for speeds and quality ratings.
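Is the NVIDIA GeForce RTX 4060 8GB good for AI coding?
With only 8 GB of VRAM, the NVIDIA GeForce RTX 4060 8GB is limited for AI coding workflows. 7B-class coding models such as Qwen 2.5 Coder 7B Instruct run at about 31 tok/s, while Qwen 2.5 Coder 32B Instruct is rated not viable.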
How much VRAM does the NVIDIA GeForce RTX 4060 8GB have?
The NVIDIA GeForce RTX 4060 8GB has 8 GB of GDDR6 VRAM with 272 GB/s bandwidth.
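Can the NVIDIA GeForce RTX 4060 8GB run 70B models?
Not within its 8 GB of VRAM. The compatibility table rates Llama 3.1 70B Instruct, Llama 3.3 70B Instruct, and Qwen 2.5 72B Instruct as not viable on this GPU, even at Q2_K. 70B models can still run with CPU offloading, but performance will be reduced. Consider a GPU with 48GB+ VRAM for full-speed 70B inference.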
Is the NVIDIA GeForce RTX 4060 8GB worth it for AI?
At approximately $289, the NVIDIA GeForce RTX 4060 8GB offers 8 GB of VRAM. It works for smaller models and experimentation, but 20 of the 52 models in the compatibility table are rated not viable.
