Trademark Notice: NVIDIA, GeForce, and RTX are trademarks of NVIDIA Corporation. AMD and Radeon are trademarks of Advanced Micro Devices, Inc. Apple, Mac, and Apple Silicon are trademarks of Apple Inc. All other product names, logos, and brands are property of their respective owners. AI model names (Llama, Gemma, Mistral, Qwen, etc.) are trademarks of their respective creators. Use of these names and logos is for identification purposes only and does not imply endorsement.

Independence & Affiliates: OwnRig is an independent resource. We are not affiliated with, endorsed by, or sponsored by any hardware manufacturer, AI model provider, or retailer. Our recommendations are based on technical merit and community benchmarks. Some links on this site are affiliate links. If you purchase through them, we may earn a small commission at no extra cost to you. This does not influence our recommendations.

Data Accuracy: Performance figures are estimates based on community benchmarks and may vary by configuration, driver version, and software. Prices are approximate US retail as of March 2026 and may vary by retailer and region. VRAM requirements are calculated from model parameters with overhead estimates. Always verify specifications with manufacturer documentation before purchasing.

© 2026 OwnRig. All rights reserved.

NVIDIA
Desktop GPU

NVIDIA GeForce RTX 5060 8GB

8 GB GDDR7 · 448 GB/s

From $299 (estimated street price)

VRAM: 8 GB
Bandwidth: 448 GB/s
TDP: 145 W
Models: 52
Tier: Limited

The NVIDIA GeForce RTX 5060 8GB, with 8 GB of GDDR7 VRAM, is rated against 52 AI models across the embedding, AI building, and coding categories; 20 of them are rated Not viable in the compatibility table. Best performance: all-MiniLM-L6-v2 at 9,775 tok/s (Excellent). Current price: approximately $299.

Source: OwnRig methodology

VRAM: 8 GB
Bandwidth: 448 GB/s
Memory Type: GDDR7
TDP: 145 W
Form Factor: 2-slot, 241 mm
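
The 448 GB/s bandwidth figure gives a rough ceiling on text-generation speed, because decoding one token has to read approximately every model weight once. The sketch below is only an illustration of that arithmetic, not OwnRig's methodology: the bits-per-weight values and the rounded parameter counts are assumptions, and the function name is hypothetical.

```python
# Rough bits per weight for some llama.cpp quantizations.
# ASSUMPTION: approximate values; real GGUF files vary by model.
APPROX_BITS_PER_WEIGHT = {
    "Q3_K_M": 3.9,
    "Q4_K_M": 4.85,
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
}


def decode_ceiling_tok_s(params_billion, quant, bandwidth_gb_s=448.0):
    """Upper bound on decode speed if all weights are read once per token.

    Ignores KV-cache reads, compute time, and kernel overhead, so real
    throughput will be lower than this number.
    """
    weight_gb = params_billion * APPROX_BITS_PER_WEIGHT[quant] / 8.0
    return bandwidth_gb_s / weight_gb


if __name__ == "__main__":
    # (name, assumed parameter count in billions, quant, speed listed on this page)
    examples = [
        ("Llama 3.1 8B Instruct", 8.0, "Q4_K_M", 37),
        ("Llama 3.2 3B Instruct", 3.2, "Q8_0", 75),
    ]
    for name, params, quant, listed in examples:
        ceiling = decode_ceiling_tok_s(params, quant)
        print(f"{name} {quant}: ceiling ~{ceiling:.0f} tok/s, listed {listed} tok/s")
```

Under these assumptions the listed speeds sit well below the bandwidth ceiling, which is what one would expect from estimates that account for real-world overhead.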

Builder Capability: Limited

Insufficient VRAM for most AI coding workflows.
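
To see why 8 GB is tight, a model's footprint can be approximated from its parameter count, its quantization width, and an overhead allowance. The following is a minimal sketch of that calculation; the 20% overhead and the bits-per-weight figures are assumptions chosen for illustration, not the values OwnRig uses.

```python
def estimated_vram_gb(params_billion, bits_per_weight, overhead_fraction=0.2):
    """Weights plus a flat overhead allowance, in GB.

    ASSUMPTION: overhead_fraction is a hypothetical stand-in for KV cache,
    activations, and runtime buffers; the real value depends on context
    length and the inference runtime.
    """
    weights_gb = params_billion * bits_per_weight / 8.0
    return weights_gb * (1.0 + overhead_fraction)


def fits_in_vram(params_billion, bits_per_weight, vram_gb=8.0, overhead_fraction=0.2):
    """True if the estimated footprint fits entirely in the given VRAM."""
    needed = estimated_vram_gb(params_billion, bits_per_weight, overhead_fraction)
    return needed <= vram_gb


if __name__ == "__main__":
    # A 7B model at roughly 4.85 bits per weight (Q4_K_M-like, assumed).
    print(fits_in_vram(7.0, 4.85))
    # A 32B model at roughly 3 bits per weight (Q2_K-like, assumed).
    print(fits_in_vram(32.0, 3.0))
```

With these numbers a 7B model at 4-bit quantization fits with room to spare, while a 32B model does not fit even at about 3 bits per weight, which matches the pattern of ratings on this page.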

Software

Inference Backends

The software stacks that matter most for real-world inference on this device.

CUDA (production): Primary high-performance backend for NVIDIA inference workloads.

Vulkan (stable): Fallback backend for llama.cpp and related local runtimes.
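
The CUDA-first, Vulkan-fallback ordering described above can be sketched as a simple preference check. This is a hypothetical helper, not part of any runtime: it only looks for the `nvidia-smi` and `vulkaninfo` command-line tools on the PATH, which suggests a driver is installed but does not prove that a given inference runtime was built with that backend. The final "cpu" fallback is an assumption added for completeness.

```python
import shutil


def pick_backend(which=shutil.which):
    """Return a preferred backend name based on which tools are on PATH.

    `which` is injectable so the logic can be exercised without real
    hardware; by default it is shutil.which from the standard library.
    """
    if which("nvidia-smi"):
        return "cuda"    # NVIDIA driver tooling found: prefer CUDA
    if which("vulkaninfo"):
        return "vulkan"  # Vulkan tooling found: use it as the fallback
    return "cpu"         # ASSUMPTION: CPU-only as a last resort


if __name__ == "__main__":
    print(pick_backend())
```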

What it can run

52 models
Model | Quantization | Speed | Rating
all-MiniLM-L6-v2 | FP16 | 9775 tok/s | Excellent
Arcee Trinity Nano 6B | Q8_0 | 55 tok/s | Excellent
Code Llama 34B Instruct | Q2_K | – | Not viable
Codestral 22B | Q3_K_M | – | Not viable
Command R 35B | Q2_K | – | Not viable
DeepSeek Coder V2 Lite 16B | Q3_K_M | 52 tok/s | Good
DeepSeek R1 Distill Qwen 32B | Q2_K | – | Not viable
DeepSeek R1 Distill Qwen 7B | Q4_K_M | 37 tok/s | Good
DeepSeek V3 | Q2_K | – | Not viable
FLUX.1 Dev | Q4_K_M | – | Marginal
Gemma 2 27B Instruct | Q3_K_M | – | Not viable
Gemma 2 9B Instruct | Q4_K_M | 32 tok/s | Good
Gemma 3 12B | Q3_K_M | 21 tok/s | Marginal
Gemma 3 27B | Q3_K_M | – | Not viable
Gemma 3 4B | Q5_K_M | 63 tok/s | Excellent
Gemma 4 E2B | Q8_0 | 47 tok/s | Good
Gemma 4 E4B | Q6_K | 37 tok/s | Good
GigaChat Lightning 10B | Q4_K_M | 74 tok/s | Acceptable
InternLM 2.5 7B Chat | Q4_K_M | 35 tok/s | Good
Llama 3.1 70B Instruct | Q2_K | – | Not viable
Llama 3.1 8B Instruct | Q4_K_M | 37 tok/s | Good
Llama 3.2 1B Instruct | Q8_0 | 109 tok/s | Excellent
Llama 3.2 3B Instruct | Q8_0 | 75 tok/s | Excellent
Llama 3.3 70B Instruct | Q2_K | – | Not viable
LLaVA 1.6 13B | Q3_K_M | 25 tok/s | Marginal
Mistral 7B Instruct v0.3 | Q4_K_M | 36 tok/s | Good
Mistral Small 24B Instruct | Q3_K_M | – | Not viable
Mixtral 8x7B Instruct | Q4_K_M | – | Not viable
nomic-embed-text v1.5 | Q8_0 | 4830 tok/s | Excellent
NVIDIA Nemotron-3-super-120B-A12B | Q2_K | – | Not viable
Phi-3 Medium 14B Instruct | Q3_K_M | 23 tok/s | Marginal
Phi-3 Mini 3.8B Instruct | Q5_K_M | 60 tok/s | Excellent
Phi-4 14B | Q3_K_M | 22 tok/s | Marginal
Phi-4 Mini | Q5_K_M | 63 tok/s | Excellent
Qwen 2.5 14B Instruct | Q3_K_M | 20 tok/s | Marginal
Qwen 2.5 72B Instruct | Q2_K | – | Not viable
Qwen 2.5 7B Instruct | Q4_K_M | 35 tok/s | Good
Qwen 2.5 Coder 32B Instruct | Q2_K | – | Not viable
Qwen 2.5 Coder 7B Instruct | Q4_K_M | 36 tok/s | Good
Qwen3-14B Instruct | Q3_K_M | 21 tok/s | Acceptable
Qwen3-8B Instruct | Q5_K_M | 28 tok/s | Acceptable
Qwen3.5-27B | Q3_K_M | – | Not viable
Qwen3.5-397B (MoE) | Q2_K | – | Not viable
Qwen3.6-27B | Q3_K_M | – | Not viable
QwQ 32B Preview | Q2_K | – | Not viable
Stable Diffusion 3 Medium | FP16 | – | Good
Stable Diffusion 3.5 Large | Q8_0 | – | Not viable
Stable Diffusion XL 1.0 | FP16 | – | Good
StarCoder 2 15B | Q3_K_M | 18 tok/s | Marginal
Whisper Large V3 | Q5_K_M | – | Excellent
Whisper Large V3 Turbo | FP16 | – | Excellent
Yi 1.5 34B Chat | Q2_K | – | Not viable
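
For readers who want to work with these ratings programmatically, a handful of rows from the table above can be held in a small record type and filtered. This is an illustrative sketch only: the `Row` type, the `USABLE` set (which treats Marginal and Not viable as unusable), and the `usable` helper are all assumptions, and only six of the 52 rows are copied in.

```python
from collections import Counter
from typing import NamedTuple, Optional


class Row(NamedTuple):
    model: str
    quant: str
    tok_s: Optional[float]  # None where the page lists no speed
    rating: str


# A sample of rows copied from the compatibility table.
ROWS = [
    Row("all-MiniLM-L6-v2", "FP16", 9775, "Excellent"),
    Row("Llama 3.1 8B Instruct", "Q4_K_M", 37, "Good"),
    Row("Qwen 2.5 Coder 7B Instruct", "Q4_K_M", 36, "Good"),
    Row("Phi-4 14B", "Q3_K_M", 22, "Marginal"),
    Row("Qwen 2.5 Coder 32B Instruct", "Q2_K", None, "Not viable"),
    Row("Whisper Large V3", "Q5_K_M", None, "Excellent"),
]

# ASSUMPTION: which ratings count as usable is a judgment call.
USABLE = {"Excellent", "Good", "Acceptable"}


def usable(rows, min_tok_s=0.0):
    """Keep rows with a usable rating; rows without a speed are not speed-filtered."""
    return [
        r for r in rows
        if r.rating in USABLE and (r.tok_s is None or r.tok_s >= min_tok_s)
    ]


if __name__ == "__main__":
    print(Counter(r.rating for r in ROWS))
    for r in usable(ROWS, min_tok_s=30):
        print(r.model, r.quant, r.rating)
```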


Buy Used

Prices and availability vary. Inspect hardware before purchasing. Some links may be affiliate links.

eBay · Marketplace · r/hardwareswap
FAQ

Frequently Asked Questions

What AI models can the NVIDIA GeForce RTX 5060 8GB run?
The NVIDIA GeForce RTX 5060 8GB is rated against 52 AI models. Top performers include all-MiniLM-L6-v2, nomic-embed-text v1.5, and Llama 3.2 1B Instruct. See the full compatibility table above for speeds and quality ratings.
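Is the NVIDIA GeForce RTX 5060 8GB good for AI coding?
With 8 GB, the NVIDIA GeForce RTX 5060 8GB has limited VRAM for AI coding workflows. In the table above, Qwen 2.5 Coder 7B Instruct runs at Q4_K_M (36 tok/s, Good), while Qwen 2.5 Coder 32B Instruct and Code Llama 34B Instruct are rated Not viable.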
How much VRAM does the NVIDIA GeForce RTX 5060 8GB have?
The NVIDIA GeForce RTX 5060 8GB has 8 GB of GDDR7 VRAM with 448 GB/s of memory bandwidth.
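Can the NVIDIA GeForce RTX 5060 8GB run 70B models?
Not within its 8 GB of VRAM: the compatibility table above rates Llama 3.1 70B Instruct and Llama 3.3 70B Instruct as Not viable even at Q2_K. 70B models can still run with CPU offloading, but performance will be substantially reduced. Consider a GPU with 48 GB or more of VRAM for full-speed 70B inference.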
Is the NVIDIA GeForce RTX 5060 8GB worth it for AI?
At $299, the NVIDIA GeForce RTX 5060 8GB offers 8 GB of VRAM and is rated against 52 AI models. It works for smaller models and experimentation.
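
The CPU offloading mentioned in the 70B answer can be made concrete with a rough layer-split estimate: how many transformer layers fit in 8 GB, and how many must stay in system RAM. The sketch below is hedged throughout. It assumes weights are spread evenly across layers, about 3 bits per weight for a Q2_K-like quantization, 80 layers for a 70B Llama-style model, and a hypothetical 1.5 GB reserve for KV cache and buffers. The function name is illustrative.

```python
def gpu_layer_split(params_billion, bits_per_weight, n_layers,
                    vram_gb=8.0, reserve_gb=1.5):
    """Return (layers_on_gpu, layers_on_cpu) under a simple even-split model.

    ASSUMPTIONS: weights are divided evenly across layers (embeddings and
    output head are ignored), and reserve_gb is a hypothetical allowance
    for KV cache and runtime buffers.
    """
    weights_gb = params_billion * bits_per_weight / 8.0
    per_layer_gb = weights_gb / n_layers
    budget_gb = max(vram_gb - reserve_gb, 0.0)
    gpu_layers = min(n_layers, int(budget_gb / per_layer_gb))
    return gpu_layers, n_layers - gpu_layers


if __name__ == "__main__":
    on_gpu, on_cpu = gpu_layer_split(70.0, 3.0, n_layers=80)
    print(f"GPU layers: {on_gpu}, CPU layers: {on_cpu}")
```

Under these assumptions only about a quarter of the layers fit on the card, so most of each token's work runs from system memory. In llama.cpp, a split like this is what the `--n-gpu-layers` option controls.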

Related Guides

Buying Guide

RX 9060 XT vs RTX 5060: which budget GPU wins for local AI?

Same $299 entry point, different ecosystems. We compare VRAM tiers, memory bandwidth, model counts from our compatibility matrix, and when AMD ROCm is worth the friction.
