NVIDIA
Desktop GPU

NVIDIA GeForce RTX 4090

24 GB GDDR6X · 1008 GB/s

From $1,799 (estimated street price)

VRAM: 24 GB
Bandwidth: 1008 GB/s
TDP: 450 W
Models: 59 table entries (58 distinct models)
Tier: Power

The NVIDIA GeForce RTX 4090, with 24 GB of GDDR6X VRAM, is rated against 58 AI models across the embedding, AI building, and coding categories. Best performance: Llama 3.2 1B Instruct at 250 tok/s (Excellent). For AI coding workflows, it qualifies for the Power AI Coding tier, which means it runs 32B coding models at good quality. Estimated street price: approximately $1,799.

Source: OwnRig methodology

VRAM: 24 GB
Bandwidth: 1008 GB/s
Memory Type: GDDR6X
TDP: 450 W
Form Factor: 3-slot, 336 mm
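The bandwidth figure matters because token generation is usually limited by how fast the weights can be read from VRAM. A rough ceiling, assuming a dense model whose full weights are read once per generated token, is bandwidth divided by weight size. The sketch below illustrates that heuristic only; it is not OwnRig's methodology, and the function names and the 8.5 bits-per-weight figure are assumptions made for illustration.

```python
def weights_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8.0


def decode_ceiling_tok_s(bandwidth_gb_s: float, params_billion: float,
                         bits_per_weight: float) -> float:
    """Upper bound on tokens/s if every token reads all weights once."""
    return bandwidth_gb_s / weights_size_gb(params_billion, bits_per_weight)


RTX_4090_BANDWIDTH_GB_S = 1008.0  # from the spec block on this page

# Assumed inputs: an 8B dense model at roughly 8.5 bits per weight (Q8_0-like).
ceiling = decode_ceiling_tok_s(RTX_4090_BANDWIDTH_GB_S, 8.0, 8.5)
print(f"ceiling: {ceiling:.0f} tok/s")
```

Measured speeds should land below this ceiling, since compute, KV-cache reads, and runtime overhead all take their share. Mixture-of-experts models read only their active parameters per token, so the dense formula understates what they can reach.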

Builder Capability: Power AI Coding

Runs 32B coding models at good quality. Can run a coding model and an embedding model concurrently.
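Whether a coding model and an embedding model fit together in 24 GB comes down to simple arithmetic. The sketch below is a back-of-the-envelope check, not OwnRig's rating logic: the bits-per-weight values are approximations for common GGUF quantizations, the 1 GB overhead for KV cache and runtime buffers is a placeholder assumption, and the 0.14B embedding model is hypothetical.

```python
# Approximate bits per weight for common quantizations (assumed values).
APPROX_BITS_PER_WEIGHT = {
    "Q4_K_M": 4.85,
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
    "FP16": 16.0,
}


def weights_gb(params_billion: float, quant: str) -> float:
    """Approximate weight size in GB for a model at a given quantization."""
    return params_billion * APPROX_BITS_PER_WEIGHT[quant] / 8.0


def fits(models, vram_gb: float = 24.0, overhead_gb: float = 1.0) -> bool:
    """True if all (params_billion, quant) pairs fit in VRAM together.

    overhead_gb is a placeholder for KV cache and runtime buffers; real
    usage grows with context length.
    """
    total = sum(weights_gb(params, quant) for params, quant in models)
    return total + overhead_gb <= vram_gb


coder = (32.0, "Q4_K_M")    # a 32B coding model
embedder = (0.14, "FP16")   # hypothetical small embedding model

print(fits([coder]))              # coding model alone
print(fits([coder, embedder]))    # coding model plus embeddings
print(fits([(32.0, "Q8_0")]))     # same model at Q8_0
```

Under these assumptions the 32B model fits at Q4_K_M with room for the embedding model, while Q8_0 does not fit at all, which is why the larger entries in the compatibility table use 4-bit and 5-bit quantizations.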

Software

Inference Backends

The software stacks that matter most for real-world inference on this device.

CUDA (production): Primary high-performance backend for NVIDIA inference workloads.

Vulkan (stable): Fallback backend for llama.cpp and related local runtimes.
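The primary-then-fallback ordering can be sketched as a small selection routine. This is an illustrative heuristic, not how any particular runtime chooses its backend: it assumes that finding `nvidia-smi` on the PATH means a CUDA-capable driver is installed, and that finding `vulkaninfo` means Vulkan is usable. The `pick_backend` function is hypothetical.

```python
import shutil
from typing import Callable, Optional


def pick_backend(which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Prefer CUDA, fall back to Vulkan, then CPU.

    Tool presence on PATH is only a heuristic; a real check would query
    the driver or the runtime itself.
    """
    if which("nvidia-smi"):
        return "cuda"
    if which("vulkaninfo"):
        return "vulkan"
    return "cpu"


print(pick_backend())
```

Passing `which` as a parameter keeps the function testable on machines that have neither tool installed.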

What it can run

58 models
| Model | Quantization | Speed | Rating |
|---|---|---|---|
| all-MiniLM-L6-v2 | FP16 | – | Excellent |
| Arcee Trinity Mini 26B | Q5_K_M | 62 tok/s | Excellent |
| Arcee Trinity Nano 6B | Q8_0 | 178 tok/s | Excellent |
| Code Llama 34B Instruct | Q4_K_M | 22 tok/s | Good |
| Codestral 22B | Q5_K_M | 35 tok/s | Excellent |
| DeepSeek Coder V2 Lite 16B | Q5_K_M | 55 tok/s | Excellent |
| DeepSeek R1 | Q2_K | 1 tok/s | Not viable |
| DeepSeek R1 Distill Qwen 32B | Q4_K_M | 24 tok/s | Good |
| DeepSeek R1 Distill Qwen 7B | Q4_K_M | 92 tok/s | Excellent |
| DeepSeek V3 | Q2_K | – | Not viable |
| FLUX.1 Dev | FP16 | – | Excellent |
| Gemma 2 27B Instruct | Q4_K_M | 22 tok/s | Good |
| Gemma 2 9B Instruct | Q8_0 | 80 tok/s | Excellent |
| Gemma 3 12B | Q5_K_M | 75 tok/s | Excellent |
| Gemma 3 27B | Q4_K_M | 22 tok/s | Good |
| Gemma 4 26B-A4B | Q5_K_M | 229 tok/s | Excellent |
| Gemma 4 31B | Q4_K_M | 38 tok/s | Good |
| Gemma 4 E2B | Q8_0 | 152 tok/s | Excellent |
| Gemma 4 E4B | Q8_0 | 94 tok/s | Excellent |
| GigaChat Lightning 10B | Q8_0 | 110 tok/s | Good |
| InternLM 2.5 7B Chat | Q8_0 | 88 tok/s | Excellent |
| Llama 3.1 70B Instruct | Q3_K_M | 5 tok/s | Marginal |
| Llama 3.1 8B Instruct | Q8_0 | 95 tok/s | Excellent |
| Llama 3.2 11B Vision | Q8_0 | 95 tok/s | Excellent |
| Llama 3.2 1B Instruct | Q8_0 | 250 tok/s | Excellent |
| Llama 3.2 3B Instruct | Q8_0 | 170 tok/s | Excellent |
| Llama 3.3 70B Instruct | Q3_K_M | 6 tok/s | Marginal |
| LLaVA 1.6 13B | Q5_K_M | 30 tok/s | Good |
| Mistral 7B Instruct v0.3 | Q8_0 | 90 tok/s | Excellent |
| Mistral Large 2 123B | Q2_K | 3 tok/s | Marginal |
| Mistral Small 24B Instruct | Q5_K_M | 32 tok/s | Good |
| Mixtral 8x7B Instruct | Q3_K_M | 35 tok/s | Good |
| nomic-embed-text v1.5 | FP16 | – | Excellent |
| NVIDIA Nemotron-3-super-120B-A12B | Q2_K | 18 tok/s | Marginal |
| Phi-3 Medium 14B Instruct | Q8_0 | 55 tok/s | Excellent |
| Phi-3 Mini 3.8B Instruct | Q8_0 | 130 tok/s | Excellent |
| Phi-4 14B | Q5_K_M | 58 tok/s | Excellent |
| Phi-4 Mini | Q8_0 | 160 tok/s | Excellent |
| Qwen 2.5 14B Instruct | Q5_K_M | 55 tok/s | Excellent |
| Qwen 2.5 7B Instruct | Q8_0 | 88 tok/s | Excellent |
| Qwen 2.5 Coder 32B Instruct | Q4_K_M | 25 tok/s | Good |
| Qwen 2.5 Coder 7B Instruct | Q8_0 | 90 tok/s | Excellent |
| Qwen3-14B Instruct | Q8_0 | 41 tok/s | Good |
| Qwen3-30B-A3B | Q5_K_M | 25 tok/s | Good |
| Qwen3-32B Instruct | Q5_K_M | 25 tok/s | Good |
| Qwen3-32B Instruct | Q4_K_M | 30 tok/s | Good |
| Qwen3-8B Instruct | Q8_0 | 83 tok/s | Excellent |
| Qwen3.5-122B-A10B | Q3_K_M | 19 tok/s | Marginal |
| Qwen3.5-27B | Q5_K_M | 40 tok/s | Good |
| Qwen3.5-397B (MoE) | Q2_K | – | Not viable |
| Qwen3.6-27B | Q5_K_M | 40 tok/s | Good |
| Qwen3.6-35B-A3B | Q4_K_M | 25 tok/s | Good |
| QwQ 32B Preview | Q4_K_M | 24 tok/s | Good |
| Stable Diffusion 3 Medium | FP16 | – | Excellent |
| Stable Diffusion 3.5 Large | FP16 | – | Excellent |
| Stable Diffusion XL 1.0 | FP16 | – | Excellent |
| StarCoder 2 15B | Q8_0 | 50 tok/s | Excellent |
| Whisper Large V3 | FP16 | – | Excellent |
| Whisper Large V3 Turbo | FP16 | – | Excellent |

Frequently Asked Questions

What AI models can the NVIDIA GeForce RTX 4090 run?
The NVIDIA GeForce RTX 4090 is rated against 58 AI models. The fastest are Llama 3.2 1B Instruct (250 tok/s), Gemma 4 26B-A4B (229 tok/s), and Arcee Trinity Nano 6B (178 tok/s). See the full compatibility table above for speeds and quality ratings.

Is the NVIDIA GeForce RTX 4090 good for AI coding?
Yes. With 24 GB of VRAM, the NVIDIA GeForce RTX 4090 supports the Power AI Coding tier: it runs 32B coding models at good quality.

How much VRAM does the NVIDIA GeForce RTX 4090 have?
The NVIDIA GeForce RTX 4090 has 24 GB of GDDR6X VRAM with 1008 GB/s of bandwidth.

Can the NVIDIA GeForce RTX 4090 run 70B models?
70B models can run on the NVIDIA GeForce RTX 4090 with CPU offloading, but slowly. The compatibility table above rates Llama 3.1 70B Instruct and Llama 3.3 70B Instruct as Marginal, at 5 and 6 tok/s respectively (Q3_K_M). Consider a GPU with 48 GB or more of VRAM for full-speed 70B inference.

Is the NVIDIA GeForce RTX 4090 worth it for AI?
At an estimated $1,799, the NVIDIA GeForce RTX 4090 offers 24 GB of VRAM and is rated against 58 AI models. It handles local AI inference well for models in the 32B class and below, all of which rate Good or Excellent in the table above; the 70B and larger models listed are rated Marginal or Not viable.
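
The CPU offloading mentioned in the 70B answer means keeping as many transformer layers as fit in VRAM and running the rest on the CPU. The sketch below estimates that split under loudly simplified assumptions: every layer is the same size, 2 GB of VRAM is reserved for KV cache and buffers, and the 70B model at Q3_K_M weighs about 34 GB (roughly 3.9 bits per weight). The `gpu_layers` helper is hypothetical; the 80-layer count applies to Llama 70B models.

```python
import math


def gpu_layers(total_layers: int, weights_gb: float,
               vram_gb: float = 24.0, reserve_gb: float = 2.0) -> int:
    """Estimate how many equally sized layers fit in VRAM.

    reserve_gb is an assumed allowance for KV cache and runtime buffers.
    """
    per_layer_gb = weights_gb / total_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    return min(total_layers, math.floor(usable_gb / per_layer_gb))


# Assumed: 70B parameters at about 3.9 bits per weight, so 70 * 3.9 / 8 GB.
weights_70b_q3 = 70 * 3.9 / 8
layers = gpu_layers(total_layers=80, weights_gb=weights_70b_q3)
print(f"{layers} of 80 layers on the GPU")
```

In llama.cpp, a number like this is what you would pass to `--n-gpu-layers`. The layers left on the CPU run at system-memory speed, which is why offloaded 70B models land in the single-digit tok/s range.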
