NVIDIA RTX 4060 Laptop (40-60W)

8 GB GDDR6 · 256 GB/s

Pricing: Included in laptop (not sold as a standalone component)

VRAM: 8 GB
Bandwidth: 256 GB/s
TDP: 40W
Models: 52
Tier: Limited

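The NVIDIA RTX 4060 Laptop (40-60W) with 8 GB GDDR6 VRAM is rated against 52 AI models across embedding, AI building, and coding workloads: 31 are rated Acceptable or better, 1 is Marginal, and 20 are Not viable. Best performance: all-MiniLM-L6-v2 at 5100 tok/s (Excellent). No standalone price is listed because the GPU is sold only as part of a laptop.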

Source: OwnRig methodology

VRAM: 8 GB
Bandwidth: 256 GB/s
Memory Type: GDDR6
TDP: 40W
Form Factor: Laptop (soldered)

Laptop Performance Note

Laptop GPU performance varies by manufacturer, cooling design, and power limits. The tok/s numbers below reflect sustained performance after thermal throttling, not peak. Actual results on your specific laptop may differ by 10-20%.
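
To make the 10-20% note concrete, here is a minimal sketch that turns a listed tok/s figure into a range. It assumes the deviation is symmetric around the listed value, which the note does not state; `throughput_band` is a hypothetical helper name, not part of any OwnRig tooling.

```python
def throughput_band(listed_tok_s: float, spread: float) -> tuple[float, float]:
    """Return (low, high) tok/s if real results differ from the listing by +/- spread.

    Assumption: the deviation is symmetric around the listed figure.
    """
    return listed_tok_s * (1 - spread), listed_tok_s * (1 + spread)


if __name__ == "__main__":
    listed = 19.0  # Llama 3.1 8B Instruct at Q4_K_M, as listed on this page
    for spread in (0.10, 0.20):
        low, high = throughput_band(listed, spread)
        print(f"+/-{spread:.0%}: {low:.1f} to {high:.1f} tok/s")
```

For the listed 19 tok/s, this gives roughly 17.1 to 20.9 tok/s at 10% and 15.2 to 22.8 tok/s at 20%.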

Builder Capability: Limited

Insufficient VRAM for most AI coding workflows.

Software

Inference Backends

The software stacks that matter most for real-world inference on this device.

CUDA (production): Primary high-performance backend for NVIDIA inference workloads.

Vulkan (stable): Fallback backend for llama.cpp and related local runtimes.

What it can run

52 models
Model | Quantization | Speed | Rating
all-MiniLM-L6-v2 | FP16 | 5100 tok/s | Excellent
Arcee Trinity Nano 6B | Q8_0 | 45 tok/s | Excellent
Code Llama 34B Instruct | Q2_K | – | Not viable
Codestral 22B | Q3_K_M | – | Not viable
Command R 35B | Q2_K | – | Not viable
DeepSeek Coder V2 Lite 16B | Q3_K_M | 27 tok/s | Good
DeepSeek R1 Distill Qwen 32B | Q2_K | – | Not viable
DeepSeek R1 Distill Qwen 7B | Q4_K_M | 19 tok/s | Acceptable
DeepSeek V3 | Q2_K | – | Not viable
FLUX.1 Dev | Q4_K_M | – | Marginal
Gemma 2 27B Instruct | Q3_K_M | – | Not viable
Gemma 2 9B Instruct | Q4_K_M | 17 tok/s | Acceptable
Gemma 3 12B | Q3_K_M | 11 tok/s | Acceptable
Gemma 3 27B | Q3_K_M | – | Not viable
Gemma 3 4B | Q5_K_M | 33 tok/s | Good
Gemma 4 E2B | Q8_0 | 38 tok/s | Good
Gemma 4 E4B | Q6_K | 30 tok/s | Good
GigaChat Lightning 10B | Q4_K_M | 48 tok/s | Acceptable
InternLM 2.5 7B Chat | Q4_K_M | 18 tok/s | Acceptable
Llama 3.1 70B Instruct | Q2_K | – | Not viable
Llama 3.1 8B Instruct | Q4_K_M | 19 tok/s | Acceptable
Llama 3.2 1B Instruct | Q8_0 | 57 tok/s | Excellent
Llama 3.2 3B Instruct | Q8_0 | 39 tok/s | Good
Llama 3.3 70B Instruct | Q2_K | – | Not viable
LLaVA 1.6 13B | Q3_K_M | 13 tok/s | Acceptable
Mistral 7B Instruct v0.3 | Q4_K_M | 19 tok/s | Acceptable
Mistral Small 24B Instruct | Q3_K_M | – | Not viable
Mixtral 8x7B Instruct | Q4_K_M | – | Not viable
nomic-embed-text v1.5 | Q8_0 | 2520 tok/s | Excellent
NVIDIA Nemotron-3-super-120B-A12B | Q2_K | – | Not viable
Phi-3 Medium 14B Instruct | Q3_K_M | 12 tok/s | Acceptable
Phi-3 Mini 3.8B Instruct | Q5_K_M | 31 tok/s | Good
Phi-4 14B | Q3_K_M | 11 tok/s | Acceptable
Phi-4 Mini | Q5_K_M | 33 tok/s | Good
Qwen 2.5 14B Instruct | Q3_K_M | 10 tok/s | Acceptable
Qwen 2.5 72B Instruct | Q2_K | – | Not viable
Qwen 2.5 7B Instruct | Q4_K_M | 18 tok/s | Acceptable
Qwen 2.5 Coder 32B Instruct | Q2_K | – | Not viable
Qwen 2.5 Coder 7B Instruct | Q4_K_M | 19 tok/s | Acceptable
Qwen3-14B Instruct | Q3_K_M | 11 tok/s | Acceptable
Qwen3-8B Instruct | Q5_K_M | 14 tok/s | Acceptable
Qwen3.5-27B | Q3_K_M | – | Not viable
Qwen3.5-397B (MoE) | Q2_K | – | Not viable
Qwen3.6-27B | Q3_K_M | – | Not viable
QwQ 32B Preview | Q2_K | – | Not viable
Stable Diffusion 3 Medium | FP16 | – | Good
Stable Diffusion 3.5 Large | Q8_0 | – | Not viable
Stable Diffusion XL 1.0 | FP16 | – | Good
StarCoder 2 15B | Q3_K_M | 10 tok/s | Acceptable
Whisper Large V3 | Q5_K_M | – | Excellent
Whisper Large V3 Turbo | FP16 | – | Excellent
Yi 1.5 34B Chat | Q2_K | – | Not viable
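
The ratings in this table can be used as a filter when shortlisting models. The sketch below copies five rows from the table and keeps those at or above a chosen rating. The ordering of the rating labels, the `Row` type, and the `at_least` helper are assumptions made for illustration; the page does not define a formal rating scale.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    model: str
    quant: str
    tok_s: Optional[float]  # None where the table shows a dash
    rating: str


# Assumed ordering of the rating labels, worst to best.
RATING_ORDER = ["Not viable", "Marginal", "Acceptable", "Good", "Excellent"]

# A handful of rows copied from the table on this page.
ROWS = [
    Row("Llama 3.2 1B Instruct", "Q8_0", 57, "Excellent"),
    Row("DeepSeek Coder V2 Lite 16B", "Q3_K_M", 27, "Good"),
    Row("Llama 3.1 8B Instruct", "Q4_K_M", 19, "Acceptable"),
    Row("FLUX.1 Dev", "Q4_K_M", None, "Marginal"),
    Row("Llama 3.1 70B Instruct", "Q2_K", None, "Not viable"),
]


def at_least(rows: list[Row], minimum: str) -> list[Row]:
    """Keep rows whose rating is at or above `minimum` in RATING_ORDER."""
    floor = RATING_ORDER.index(minimum)
    return [r for r in rows if RATING_ORDER.index(r.rating) >= floor]


if __name__ == "__main__":
    for row in at_least(ROWS, "Acceptable"):
        speed = f"{row.tok_s:g} tok/s" if row.tok_s is not None else "n/a"
        print(f"{row.model} ({row.quant}): {speed}, {row.rating}")
```

Filtering on the rating rather than on tok/s keeps image and speech models in the shortlist, since the table lists no tok/s figure for them.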


Looking for a desktop build?

Desktop GPUs offer higher sustained performance with no thermal throttling. Check our curated desktop builds for dedicated AI workstations.

Frequently Asked Questions

What AI models can the NVIDIA RTX 4060 Laptop (40-60W) run?
The compatibility table above rates the NVIDIA RTX 4060 Laptop (40-60W) against 52 AI models, of which 31 are rated Acceptable or better. Top performers include all-MiniLM-L6-v2, nomic-embed-text v1.5, and Llama 3.2 1B Instruct. See the table for speeds and quality ratings.
Is the NVIDIA RTX 4060 Laptop (40-60W) good for AI coding?
Only for smaller models. With 8 GB, the NVIDIA RTX 4060 Laptop (40-60W) has insufficient VRAM for most AI coding workflows: Qwen 2.5 Coder 7B Instruct runs at 19 tok/s (Acceptable), while coding models of 22B and larger are rated Not viable.
How much VRAM does the NVIDIA RTX 4060 Laptop (40-60W) have?
The NVIDIA RTX 4060 Laptop (40-60W) has 8 GB of GDDR6 VRAM with 256 GB/s bandwidth.
Can the NVIDIA RTX 4060 Laptop (40-60W) run 70B models?
Not from VRAM alone. The compatibility table rates Llama 3.1 70B Instruct, Llama 3.3 70B Instruct, and Qwen 2.5 72B Instruct as Not viable on this GPU, even at Q2_K. 70B models can run with CPU offloading, but performance will be reduced. Consider a device with 48 GB or more of inference memory for full-speed 70B inference.
Is the NVIDIA RTX 4060 Laptop (40-60W) worth it for AI?
The NVIDIA RTX 4060 Laptop (40-60W) has no standalone price because it is sold only as part of a laptop. It offers 8 GB GDDR6 VRAM, but recommendations should be treated as provisional until pricing and benchmarks are available.
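
The answer to the 70B question follows from simple arithmetic on weight storage. The sketch below estimates the weight size from parameter count and bits per weight and compares it with the 8 GB of VRAM. It ignores the KV cache and runtime overhead, and it uses an idealized 2 bits per weight as a lower bound for Q2_K, which stores additional scale data. The function names are hypothetical.

```python
VRAM_GB = 8.0  # from the spec list on this page


def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes), weights only."""
    return params_billion * bits_per_weight / 8


def min_offload_fraction(
    params_billion: float, bits_per_weight: float, vram_gb: float = VRAM_GB
) -> float:
    """Smallest share of the weights that cannot sit in VRAM (0.0 if all fit)."""
    size = weights_gb(params_billion, bits_per_weight)
    return max(0.0, 1.0 - vram_gb / size)


if __name__ == "__main__":
    # Idealized lower bound: 70B parameters at 2 bits per weight.
    print(f"70B at 2 bits: {weights_gb(70, 2):.1f} GB of weights")
    print(f"Share that must be offloaded: {min_offload_fraction(70, 2):.0%}")
    # An 8B model at 4 bits per weight, for comparison.
    print(f"8B at 4 bits: {weights_gb(8, 4):.1f} GB of weights")
```

Even under this optimistic bound, a 70B model needs about 17.5 GB for weights, so more than half of them would have to live in system memory, which is consistent with the Not viable rating in the table.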
