NVIDIA GeForce RTX 3090
24 GB GDDR6X · 936 GB/s
From $899 (estimated street price)
VRAM: 24 GB
Bandwidth: 936 GB/s
TDP: 350 W
Models: 21
Tier: Power
The NVIDIA GeForce RTX 3090, with 24 GB of GDDR6X VRAM, can handle 21 AI models across chat, coding, and AI coding workloads. Best performance: Llama 3.2 1B Instruct at 220 tok/s (Excellent). For AI coding workflows, it supports the Power AI Coding tier, running 32B coding models at good quality. Current price: approximately $899.
Source: OwnRig methodology
Form factor: 3-slot, 313 mm
Builder Capability: Power AI Coding
Runs 32B coding models at good quality. Can run a coding model and an embeddings model concurrently.
Inference Backends
The software stacks that matter most for real-world inference on this device.
CUDA (production): Primary high-performance backend for NVIDIA inference workloads.
Vulkan (stable): Fallback backend for llama.cpp and related local runtimes.
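The primary/fallback ordering above can be expressed as a small selection routine. The following is only a sketch: `pick_backend` is a hypothetical helper, and checking for the `nvidia-smi` and `vulkaninfo` executables on `PATH` is an assumed heuristic for "driver and tools are installed", not a guarantee that a given runtime was built with that backend.

```python
import shutil


def pick_backend(which=shutil.which):
    """Return a preferred backend name: CUDA first, Vulkan as fallback.

    `which` is injectable so the logic can be exercised without a GPU.
    Presence of the CLI tools is only a heuristic (assumption).
    """
    if which("nvidia-smi"):   # NVIDIA driver utilities found
        return "cuda"
    if which("vulkaninfo"):   # Vulkan SDK/tools found
        return "vulkan"
    return "cpu"              # nothing detected, fall back to CPU


if __name__ == "__main__":
    print(pick_backend())
```

A runtime such as llama.cpp still has to be compiled or installed with the matching backend enabled; this check only suggests which one to try first.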
What it can run
21 models

| Model | Quantization | Speed | Quality |
| --- | --- | --- | --- |
| Arcee Trinity Mini 26B | Q5_K_M | 58 tok/s | Excellent |
| Arcee Trinity Nano 6B | Q8_0 | 165 tok/s | Excellent |
| DeepSeek V3 | Q2_K | n/a | Not viable |
| Gemma 3 27B | Q4_K_M | 18 tok/s | Good |
| Gemma 4 26B-A4B | Q5_K_M | 213 tok/s | Excellent |
| Gemma 4 31B | Q4_K_M | 35 tok/s | Good |
| Gemma 4 E2B | Q8_0 | 141 tok/s | Excellent |
| Gemma 4 E4B | Q8_0 | 87 tok/s | Excellent |
| GigaChat Lightning 10B | Q8_0 | 66 tok/s | Good |
| Llama 3.1 8B Instruct | Q8_0 | 70 tok/s | Excellent |
| Llama 3.2 1B Instruct | Q8_0 | 220 tok/s | Excellent |
| Llama 3.2 3B Instruct | Q8_0 | 150 tok/s | Excellent |
| NVIDIA Nemotron-3-super-120B-A12B | Q2_K | 11 tok/s | Marginal |
| Phi-4 Mini | Q8_0 | 140 tok/s | Excellent |
| Qwen 2.5 Coder 32B Instruct | Q4_K_M | 18 tok/s | Good |
| Qwen3.5-122B-A10B | Q3_K_M | 11 tok/s | Marginal |
| Qwen3.5-27B | Q5_K_M | 24 tok/s | Good |
| Qwen3.5-397B (MoE) | Q2_K | n/a | Not viable |
| Qwen3.6-27B | Q5_K_M | 24 tok/s | Good |
| Stable Diffusion 3.5 Large | FP16 | n/a | Excellent |
| Whisper Large V3 Turbo | FP16 | n/a | Excellent |
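A rough way to sanity-check rows like these is a back-of-the-envelope estimate: a model must fit in the 24 GB of VRAM, and single-stream token generation is typically bounded by how fast the weights can be read at 936 GB/s. The sketch below is not the OwnRig methodology; the bits-per-weight figures and the 2 GB overhead for KV cache and buffers are approximate assumptions, and the function names are hypothetical.

```python
# Approximate bits per weight for common llama.cpp quantizations.
# Assumed round figures for illustration, not values from the table.
BITS_PER_WEIGHT = {
    "Q2_K": 3.0,
    "Q3_K_M": 3.9,
    "Q4_K_M": 4.85,
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
    "FP16": 16.0,
}


def weight_size_gb(params_billion, quant):
    """Approximate size of the weights in GB (1e9 params * bits / 8)."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8


def fits_in_vram(total_params_billion, quant, vram_gb=24.0, overhead_gb=2.0):
    """True if weights plus an assumed overhead fit entirely in VRAM."""
    return weight_size_gb(total_params_billion, quant) + overhead_gb <= vram_gb


def decode_ceiling_tok_s(active_params_billion, quant, bandwidth_gb_s=936.0):
    """Upper bound on tokens/s if every active weight is read once per token.

    For mixture-of-experts models, pass the active parameter count here
    and the total parameter count to fits_in_vram.
    """
    return bandwidth_gb_s / weight_size_gb(active_params_billion, quant)


if __name__ == "__main__":
    print(weight_size_gb(32, "Q4_K_M"))        # weights for a 32B model
    print(fits_in_vram(32, "Q4_K_M"))          # fits in 24 GB?
    print(decode_ceiling_tok_s(32, "Q4_K_M"))  # bandwidth-bound ceiling
```

Under these assumptions a 32B model at Q4_K_M needs about 19.4 GB for weights and has a ceiling near 48 tok/s. Measured speeds, such as the 18 tok/s listed for Qwen 2.5 Coder 32B Instruct, sit below that ceiling because real runs also pay for compute, KV cache reads, and runtime overhead.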
Frequently Asked Questions
- What AI models can the NVIDIA GeForce RTX 3090 run?
- The NVIDIA GeForce RTX 3090 can run 21 AI models. Top performers include Llama 3.2 1B Instruct, Gemma 4 26B-A4B, and Arcee Trinity Nano 6B. See the full compatibility table above for speeds and quality ratings.
- Is the NVIDIA GeForce RTX 3090 good for AI coding?
- Yes. With 24 GB of VRAM, the NVIDIA GeForce RTX 3090 supports the Power AI Coding tier: 32B coding models at good quality.
- How much VRAM does the NVIDIA GeForce RTX 3090 have?
- The NVIDIA GeForce RTX 3090 has 24 GB of GDDR6X VRAM with 936 GB/s of memory bandwidth.
- Can the NVIDIA GeForce RTX 3090 run 70B models?
- 70B models can run on the NVIDIA GeForce RTX 3090 with CPU offloading, but generation is slower because the layers that do not fit in 24 GB run from system memory. Consider a GPU with 48 GB or more of VRAM for full-speed 70B inference.
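- Is the NVIDIA GeForce RTX 3090 worth it for AI?
- At approximately $899, the NVIDIA GeForce RTX 3090 offers 24 GB of VRAM and is listed against 21 AI models. Models up to roughly 32B parameters are rated Good or Excellent in the table above, while the 120B-class entries are Marginal or Not viable.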
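To make the CPU offloading answer for 70B models concrete, the sketch below estimates how many transformer layers could stay on the GPU. Everything here is an assumption for illustration: a 70B model with 80 layers, roughly 4.85 bits per weight (about Q4_K_M), weights spread evenly across layers, and 3 GB of VRAM reserved for KV cache and buffers. `gpu_layer_split` is a hypothetical helper, not part of any runtime.

```python
def gpu_layer_split(total_params_billion, n_layers, bits_per_weight,
                    vram_gb=24.0, reserve_gb=3.0):
    """Return (layers_on_gpu, layers_on_cpu) for a partially offloaded model.

    Assumes weights are spread evenly across layers and ignores
    embeddings and output heads (a simplification).
    """
    model_gb = total_params_billion * bits_per_weight / 8
    per_layer_gb = model_gb / n_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    gpu_layers = min(n_layers, int(usable_gb // per_layer_gb))
    return gpu_layers, n_layers - gpu_layers


if __name__ == "__main__":
    on_gpu, on_cpu = gpu_layer_split(70, 80, 4.85)
    print(f"GPU layers: {on_gpu}, CPU layers: {on_cpu}")
```

With these inputs, about half of the layers fit on the card and the rest run from system memory, which is why throughput drops. The GPU layer count is the kind of value passed to a runtime's layer-offload setting, for example llama.cpp's `--n-gpu-layers` option; the right number in practice depends on context length and the actual model file.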