NVIDIA
Laptop GPU

NVIDIA RTX 4080 Laptop (120-150W)

12 GB GDDR6 · 384 GB/s

Pricing

Included in laptop

Not sold as a standalone component

VRAM

12 GB

Bandwidth

384 GB/s

TDP

120W

Models

40

Tier

Starter

The NVIDIA RTX 4080 Laptop (120-150W) with 12 GB of GDDR6 VRAM is rated against 40 AI models across the embedding, AI-building, and coding categories. The fastest listed result is all-MiniLM-L6-v2 at 8400 tok/s (Excellent). For AI coding workflows, it supports the Starter AI Coding tier, which is good for 7-8B models. Pricing has not been announced.

Source: OwnRig methodology

Memory Type

GDDR6

Form Factor

Laptop (soldered)

Laptop Performance Note

Laptop GPU performance varies by manufacturer, cooling design, and power limits. The tok/s numbers below reflect sustained performance after thermal throttling, not peak. Actual results on your specific laptop may differ by 10-20%.
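
As a rough illustration of the note above, the listed sustained figure can be turned into an expected band. This is a minimal sketch: the function name `sustained_band` is hypothetical, and it assumes the stated 10-20% difference can fall in either direction, using the 20% upper figure by default.

```python
def sustained_band(listed_tok_s: float, deviation: float = 0.20) -> tuple[float, float]:
    """Return a (low, high) tok/s band around a listed sustained figure.

    Assumption: the deviation applies symmetrically, above and below.
    """
    if not 0 <= deviation < 1:
        raise ValueError("deviation must be in [0, 1)")
    low = round(listed_tok_s * (1 - deviation), 1)
    high = round(listed_tok_s * (1 + deviation), 1)
    return low, high


# Llama 3.1 8B Instruct (Q5_K_M) is listed at 36 tok/s on this page.
low, high = sustained_band(36)
print(f"expect roughly {low} to {high} tok/s on a given laptop")
```

Passing `deviation=0.10` gives the tighter end of the stated range.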

Builder Capability: Starter AI Coding

Runs 7-8B models comfortably. Good for basic local code completion and small model experiments.
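
Why 7-8B models sit comfortably in 12 GB can be sketched with a back-of-the-envelope size estimate. Everything numeric here other than the 12 GB VRAM figure is an assumption: the bits-per-weight values are rough approximations for common llama.cpp quantizations, the 1.5 GB overhead for KV cache and runtime buffers is a placeholder, and the function names are hypothetical. It is not OwnRig's methodology.

```python
# Rough, assumed bits per weight for common quantizations (not from this page).
BITS_PER_WEIGHT = {
    "Q3_K_M": 3.9,
    "Q4_K_M": 4.85,
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
    "FP16": 16.0,
}


def weights_gb(params_billion: float, quant: str) -> float:
    """Approximate size of the model weights in GB."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8


def fits_in_vram(params_billion: float, quant: str,
                 vram_gb: float = 12.0, overhead_gb: float = 1.5) -> bool:
    """True if weights plus an assumed fixed overhead fit in VRAM."""
    return weights_gb(params_billion, quant) + overhead_gb <= vram_gb


for name, params, quant in [
    ("Llama 3.1 8B Instruct", 8, "Q5_K_M"),
    ("Qwen 2.5 14B Instruct", 14, "Q4_K_M"),
    ("Qwen3.5-27B", 27, "Q3_K_M"),
]:
    size = weights_gb(params, quant)
    verdict = "fits" if fits_in_vram(params, quant) else "does not fit"
    print(f"{name} ({quant}): ~{size:.1f} GB of weights, {verdict} in 12 GB")
```

Under these assumptions an 8B model leaves several gigabytes of headroom for context, a 14B model is tight, and a 27B model spills out of VRAM even at 3-bit quantization.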

Software

Inference Backends

The software stacks that matter most for real-world inference on this device.

CUDA

production

Primary high-performance backend for NVIDIA inference workloads.

Vulkan

stable

Fallback backend for llama.cpp and related local runtimes.

What it can run

40 models
Model | Quantization | Speed | Rating
all-MiniLM-L6-v2 | FP16 | 8400 tok/s | Excellent
Arcee Trinity Mini 26B | Q3_K_M | 6 tok/s | Not viable
Arcee Trinity Nano 6B | Q8_0 | 68 tok/s | Excellent
Codestral 22B | Q3_K_M | 8 tok/s | Marginal
DeepSeek Coder V2 Lite 16B | Q4_K_M | 39 tok/s | Good
DeepSeek R1 Distill Qwen 7B | Q5_K_M | 34 tok/s | Good
Gemma 2 9B Instruct | Q5_K_M | 34 tok/s | Good
Gemma 3 12B | Q4_K_M | 22 tok/s | Acceptable
Gemma 3 4B | Q8_0 | 59 tok/s | Excellent
Gemma 4 26B-A4B | Q3_K_M | 8 tok/s | Not viable
Gemma 4 E2B | Q8_0 | 57 tok/s | Good
Gemma 4 E4B | Q8_0 | 35 tok/s | Good
GigaChat Lightning 10B | Q4_K_M | 80 tok/s | Acceptable
InternLM 2.5 7B Chat | Q5_K_M | 32 tok/s | Good
Llama 3.1 8B Instruct | Q5_K_M | 36 tok/s | Good
Llama 3.2 1B Instruct | Q8_0 | 98 tok/s | Excellent
Llama 3.2 3B Instruct | Q8_0 | 67 tok/s | Excellent
LLaVA 1.6 13B | Q4_K_M | 20 tok/s | Acceptable
Mistral 7B Instruct v0.3 | Q5_K_M | 35 tok/s | Good
nomic-embed-text v1.5 | Q8_0 | 4550 tok/s | Excellent
NVIDIA Nemotron-3-super-120B-A12B | Q2_K | – | Not viable
Phi-3 Medium 14B Instruct | Q3_K_M | 25 tok/s | Good
Phi-3 Mini 3.8B Instruct | Q8_0 | 55 tok/s | Excellent
Phi-4 14B | Q3_K_M | 24 tok/s | Acceptable
Phi-4 Mini | Q8_0 | 57 tok/s | Excellent
Qwen 2.5 14B Instruct | Q4_K_M | 21 tok/s | Acceptable
Qwen 2.5 7B Instruct | Q5_K_M | 34 tok/s | Good
Qwen 2.5 Coder 7B Instruct | Q5_K_M | 34 tok/s | Good
Qwen3-14B Instruct | Q5_K_M | 16 tok/s | Acceptable
Qwen3-8B Instruct | Q8_0 | 21 tok/s | Acceptable
Qwen3.5-122B-A10B | Q3_K_M | – | Not viable
Qwen3.5-27B | Q3_K_M | 8 tok/s | Marginal
Qwen3.5-397B (MoE) | Q2_K | – | Not viable
Qwen3.6-27B | Q3_K_M | – | Not viable
Stable Diffusion 3 Medium | FP16 | – | Excellent
Stable Diffusion 3.5 Large | Q8_0 | – | Good
Stable Diffusion XL 1.0 | FP16 | – | Excellent
StarCoder 2 15B | Q3_K_M | 20 tok/s | Acceptable
Whisper Large V3 | FP16 | – | Excellent
Whisper Large V3 Turbo | FP16 | – | Excellent
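
One way to sanity-check the text-generation speeds in the table is a common rule of thumb: single-stream decoding of a dense model is usually limited by memory bandwidth, so tok/s cannot exceed bandwidth divided by the bytes of weights read per token. The sketch below applies that rule to the 384 GB/s figure from this page. The 5.7 GB weight size for an 8B model at Q5_K_M is an assumed estimate, the function names are hypothetical, and the rule does not apply to embedding, image, or speech models, nor cleanly to MoE models.

```python
def decode_ceiling_tok_s(weights_gb: float, bandwidth_gb_s: float = 384.0) -> float:
    """Upper bound on decode speed if every token reads all weights once."""
    if weights_gb <= 0:
        raise ValueError("weights_gb must be positive")
    return bandwidth_gb_s / weights_gb


def bandwidth_efficiency(measured_tok_s: float, weights_gb: float,
                         bandwidth_gb_s: float = 384.0) -> float:
    """Fraction of the bandwidth ceiling that a measured speed achieves."""
    return measured_tok_s / decode_ceiling_tok_s(weights_gb, bandwidth_gb_s)


# Llama 3.1 8B Instruct (Q5_K_M) is listed at 36 tok/s.
# Assumption: its weights occupy about 5.7 GB.
ceiling = decode_ceiling_tok_s(5.7)
share = bandwidth_efficiency(36, 5.7)
print(f"ceiling ~{ceiling:.0f} tok/s, listed speed is {share:.0%} of it")
```

A listed figure well below the ceiling is plausible for sustained, thermally throttled laptop results; a figure above it would suggest either a different weight size than assumed or a measurement that is not single-stream decode.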

Looking for a desktop build?

Desktop GPUs offer higher sustained performance with no thermal throttling. Check our curated desktop builds for dedicated AI workstations.

Frequently Asked Questions

What AI models can the NVIDIA RTX 4080 Laptop (120-150W) run?
The compatibility table above lists 40 AI models for the NVIDIA RTX 4080 Laptop (120-150W), with a speed and rating for each; several of the largest are rated Not viable. The fastest are all-MiniLM-L6-v2, nomic-embed-text v1.5, and Llama 3.2 1B Instruct.

Is the NVIDIA RTX 4080 Laptop (120-150W) good for AI coding?
With 12 GB of VRAM, the NVIDIA RTX 4080 Laptop (120-150W) runs 7-8B coding models comfortably, which places it in the Starter AI Coding tier. That is good for basic local code completion.

How much VRAM does the NVIDIA RTX 4080 Laptop (120-150W) have?
The NVIDIA RTX 4080 Laptop (120-150W) has 12 GB of GDDR6 VRAM with 384 GB/s of bandwidth.

Can the NVIDIA RTX 4080 Laptop (120-150W) run 70B models?
70B models can run on the NVIDIA RTX 4080 Laptop (120-150W) with CPU offloading, but performance will be reduced. Consider a device with 48 GB or more of inference memory for full-speed 70B inference.

Is the NVIDIA RTX 4080 Laptop (120-150W) worth it for AI?
Pricing for the NVIDIA RTX 4080 Laptop (120-150W) has not been announced. It offers 12 GB of GDDR6 VRAM, but any recommendation should be treated as provisional until pricing and benchmarks are available.
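
The 70B answer above depends on how much of the model stays on the GPU. A minimal sketch of that split follows, assuming a 70B model of about 42.4 GB at 4-bit quantization with 80 transformer layers of equal size, and 1.5 GB of the 12 GB reserved for KV cache and runtime buffers. These figures and the function name `gpu_layer_split` are illustrative assumptions, and the estimate ignores embedding and output tensors.

```python
import math


def gpu_layer_split(model_gb: float, n_layers: int,
                    vram_gb: float = 12.0, reserve_gb: float = 1.5) -> tuple[int, int]:
    """Return (layers_on_gpu, layers_on_cpu), assuming equal-sized layers."""
    per_layer_gb = model_gb / n_layers
    budget_gb = max(vram_gb - reserve_gb, 0.0)
    gpu_layers = min(n_layers, math.floor(budget_gb / per_layer_gb))
    return gpu_layers, n_layers - gpu_layers


# Assumed: 70B model, ~42.4 GB of weights, 80 layers.
on_gpu, on_cpu = gpu_layer_split(42.4, 80)
print(f"{on_gpu} layers on the GPU, {on_cpu} offloaded to the CPU")
```

With roughly a quarter of the layers on the GPU under these assumptions, most of each token's work runs from system memory, which is why the answer above expects reduced performance. In llama.cpp, a count like this would typically be supplied through the `--n-gpu-layers` option.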
