NVIDIA
Desktop GPU

NVIDIA GeForce RTX 3060 12GB

12 GB GDDR6 · 360 GB/s

From

$269

Estimated street price

VRAM

12 GB

Bandwidth

360 GB/s

TDP

170W

Models

35

Tier

Starter

The NVIDIA GeForce RTX 3060 12GB, with 12 GB of GDDR6 VRAM, is rated against 35 AI models across the embedding, AI building, and coding categories; 8 of them are rated Not viable in the table below. Best performance: Llama 3.2 1B Instruct at 140 tok/s (Excellent). For AI coding workflows, it sits in the Starter AI Coding tier, which suits 7–8B models. Current price: approximately $269.

Source: OwnRig methodology

Memory Type

GDDR6

Form Factor

2-slot, 242mm

Builder Capability: Starter AI Coding

Runs 7-8B models comfortably. Good for basic local code completion and small model experiments.
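
A rough way to sanity-check the "7-8B comfortably" claim is to estimate the weight footprint from parameter count and quantization. The sketch below is illustrative only: the bits-per-weight figures are approximate averages assumed for common llama.cpp quantization types, and `overhead_gb` (KV cache, activations, runtime buffers) is a hypothetical allowance, not a measured value.

```python
# Approximate bits per weight for common quantization types.
# ASSUMPTION: rough averages for illustration, not exact file sizes.
BITS_PER_WEIGHT = {
    "Q3_K_M": 3.9,
    "Q4_K_M": 4.85,
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
    "FP16": 16.0,
}


def weight_gb(params_billion: float, quant: str) -> float:
    """Estimated size of the model weights in GB (1 GB = 1e9 bytes)."""
    bits = BITS_PER_WEIGHT[quant]
    # billions of parameters * bytes per parameter = GB
    return params_billion * bits / 8


def fits_in_vram(params_billion: float, quant: str,
                 vram_gb: float = 12.0, overhead_gb: float = 1.5) -> bool:
    """True if weights plus an assumed overhead fit in the given VRAM."""
    return weight_gb(params_billion, quant) + overhead_gb <= vram_gb


if __name__ == "__main__":
    # 8B at Q5_K_M: about 5.7 GB of weights, leaving headroom on 12 GB.
    print(fits_in_vram(8, "Q5_K_M"))
    # 27B at Q3_K_M: about 13.2 GB of weights, more than the card holds.
    print(fits_in_vram(27, "Q3_K_M"))
```

Under these assumptions a 7-8B model fits even at Q8_0, while a 27B model exceeds 12 GB before any overhead is counted, which is consistent with the tier description.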

Software

Inference Backends

The software stacks that matter most for real-world inference on this device.

CUDA

production

Primary high-performance backend for NVIDIA inference workloads.

Vulkan

stable

Fallback backend for llama.cpp and related local runtimes.

What it can run

35 models
Model · Quantization · Speed · Rating
all-MiniLM-L6-v2 · FP16 · – · Excellent
Arcee Trinity Mini 26B · Q3_K_M · 5 tok/s · Not viable
Arcee Trinity Nano 6B · Q8_0 · 64 tok/s · Excellent
DeepSeek Coder V2 Lite 16B · Q4_K_M · 40 tok/s · Good
DeepSeek R1 Distill Qwen 7B · Q4_K_M · 38 tok/s · Good
DeepSeek V3 · Q2_K · – · Not viable
Gemma 2 9B Instruct · Q5_K_M · 30 tok/s · Good
Gemma 3 12B · Q4_K_M · 32 tok/s · Good
Gemma 3 27B · Q3_K_M · – · Not viable
Gemma 3 4B · Q5_K_M · 55 tok/s · Excellent
Gemma 4 26B-A4B · Q3_K_M · 8 tok/s · Not viable
Gemma 4 E2B · Q8_0 · 54 tok/s · Good
Gemma 4 E4B · Q8_0 · 33 tok/s · Good
GigaChat Lightning 10B · Q4_K_M · 56 tok/s · Acceptable
InternLM 2.5 7B Chat · Q5_K_M · 35 tok/s · Good
Llama 3.1 8B Instruct · Q5_K_M · 35 tok/s · Good
Llama 3.2 11B Vision · Q4_K_M · 22 tok/s · Acceptable
Llama 3.2 1B Instruct · Q8_0 · 140 tok/s · Excellent
Llama 3.2 3B Instruct · Q8_0 · 90 tok/s · Excellent
Mistral 7B Instruct v0.3 · Q5_K_M · 33 tok/s · Good
nomic-embed-text v1.5 · FP16 · – · Excellent
NVIDIA Nemotron-3-super-120B-A12B · Q2_K · – · Not viable
Phi-3 Mini 3.8B Instruct · Q8_0 · 60 tok/s · Excellent
Phi-4 Mini · Q8_0 · 80 tok/s · Excellent
Qwen 2.5 7B Instruct · Q5_K_M · 33 tok/s · Good
Qwen 2.5 Coder 7B Instruct · Q5_K_M · 36 tok/s · Good
Qwen3-8B Instruct · Q8_0 · 20 tok/s · Acceptable
Qwen3.5-122B-A10B · Q3_K_M · – · Not viable
Qwen3.5-27B · Q3_K_M · 5 tok/s · Marginal
Qwen3.5-397B (MoE) · Q2_K · – · Not viable
Qwen3.6-27B · Q3_K_M · – · Not viable
Stable Diffusion 3.5 Large · Q8_0 · – · Good
Stable Diffusion XL 1.0 · FP16 · – · Good
Whisper Large V3 · FP16 · – · Excellent
Whisper Large V3 Turbo · FP16 · – · Excellent
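
The listed speeds can be put in context with a common rule of thumb: single-stream token generation on a dense model is largely limited by memory bandwidth, because each generated token reads roughly all of the weights once. The sketch below computes that ceiling from the card's 360 GB/s. It is an approximation under stated assumptions, not the methodology behind the table: the weight size passed in is an estimate, and mixture-of-experts models read only their active parameters per token, so they can exceed a ceiling computed from total size.

```python
def decode_ceiling_tok_s(weights_gb: float,
                         bandwidth_gb_s: float = 360.0) -> float:
    """Upper bound on tokens/s if every token reads all weights once."""
    if weights_gb <= 0:
        raise ValueError("weights_gb must be positive")
    return bandwidth_gb_s / weights_gb


def bandwidth_efficiency(measured_tok_s: float, weights_gb: float,
                         bandwidth_gb_s: float = 360.0) -> float:
    """Fraction of the bandwidth ceiling that a measured speed reaches."""
    return measured_tok_s / decode_ceiling_tok_s(weights_gb, bandwidth_gb_s)


if __name__ == "__main__":
    # ASSUMPTION: an 8B model at Q5_K_M occupies roughly 5.7 GB.
    ceiling = decode_ceiling_tok_s(5.7)
    print(f"ceiling: {ceiling:.0f} tok/s")
    # Compare with the 35 tok/s listed for Llama 3.1 8B Instruct.
    print(f"efficiency: {bandwidth_efficiency(35, 5.7):.2f}")
```

A listed speed well below the ceiling is expected, since real runs also pay for KV-cache reads, kernel launch overhead, and sampling.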



FAQ

Frequently Asked Questions

What AI models can the NVIDIA GeForce RTX 3060 12GB run?
The table above lists 35 AI models for the NVIDIA GeForce RTX 3060 12GB, of which 26 are rated Acceptable or better. Top performers include Llama 3.2 1B Instruct (140 tok/s), Llama 3.2 3B Instruct (90 tok/s), and Phi-4 Mini (80 tok/s). See the full compatibility table above for speeds and quality ratings.
Is the NVIDIA GeForce RTX 3060 12GB good for AI coding?
With 12 GB of VRAM, the NVIDIA GeForce RTX 3060 12GB runs 7–8B coding models at the Starter tier, for example Qwen 2.5 Coder 7B Instruct at 36 tok/s (Q5_K_M). That is good for basic code completion.
How much VRAM does the NVIDIA GeForce RTX 3060 12GB have?
The NVIDIA GeForce RTX 3060 12GB has 12 GB of GDDR6 VRAM with 360 GB/s bandwidth.
Can the NVIDIA GeForce RTX 3060 12GB run 70B models?
Only with CPU offloading. A 70B model does not fit in 12 GB of VRAM even at aggressive quantization, so most layers must run from system RAM and generation speed drops sharply. Consider a GPU with 48GB+ VRAM for full-speed 70B inference.
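
To make the offloading trade-off concrete, the following sketch estimates how many transformer layers would fit on the GPU. Every input is an assumption for illustration: roughly 42.4 GB of weights for a 70B model at a 4-bit quantization, 80 equal-sized layers (typical of Llama-style 70B models), and a hypothetical 2 GB reserve for KV cache and buffers. The function name and parameters are not part of any library.

```python
import math


def gpu_layer_split(total_layers: int, weights_gb: float,
                    vram_gb: float = 12.0,
                    reserve_gb: float = 2.0) -> tuple[int, int]:
    """Return (layers_on_gpu, layers_on_cpu), assuming equal-sized layers."""
    if total_layers <= 0 or weights_gb <= 0:
        raise ValueError("total_layers and weights_gb must be positive")
    per_layer_gb = weights_gb / total_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    on_gpu = min(total_layers, math.floor(usable_gb / per_layer_gb))
    return on_gpu, total_layers - on_gpu


if __name__ == "__main__":
    # ASSUMPTION: 70B model, about 42.4 GB of weights, 80 layers.
    on_gpu, on_cpu = gpu_layer_split(total_layers=80, weights_gb=42.4)
    print(f"GPU layers: {on_gpu}, CPU layers: {on_cpu}")
```

Under these assumptions fewer than a quarter of the layers sit in VRAM, and the rest run at system-memory speed. The GPU layer count is the kind of value passed to a runtime's offload setting, such as llama.cpp's `--n-gpu-layers` option.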
Is the NVIDIA GeForce RTX 3060 12GB worth it for AI?
At approximately $269, the NVIDIA GeForce RTX 3060 12GB offers 12 GB of VRAM and is rated Acceptable or better on 26 of the 35 listed models. It works for smaller models and experimentation.
