Laptop
ASUS ROG Strix G16 (RTX 4070 Laptop)
Windows
ASUS ROG Strix G16 with a GeForce RTX 4070 Laptop GPU (80 W class) and 32 GB of RAM (representative SKU).
Memory
8 GB
GPUs
1×
RAM
32 GB
Models
52
Type
Laptop
Inference Memory
8 GB
Accelerator
8 GB GDDR6
System RAM
32 GB
CPU
Intel Core i9-14900HX
OS
Windows
Laptop Performance Note
Laptop GPU performance varies by cooling and power limits. Listed tok/s reflect sustained loads, not short bursts; your unit may differ.
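To make the sustained-versus-burst distinction concrete, here is a minimal Python sketch that compares the two rates from a list of token arrival times. The function names, the 10-second burst window, and the 120-second warm-up are illustrative assumptions, not the method used to produce the figures on this page.

```python
from typing import Sequence, Tuple


def tokens_per_second(token_times: Sequence[float], start: float, end: float) -> float:
    """Rate of tokens whose arrival time t satisfies start < t <= end."""
    if end <= start:
        raise ValueError("end must be greater than start")
    count = sum(1 for t in token_times if start < t <= end)
    return count / (end - start)


def burst_and_sustained(
    token_times: Sequence[float],
    burst_window_s: float = 10.0,  # assumption: what counts as a "short burst"
    warmup_s: float = 120.0,       # assumption: time for clocks and fans to settle
) -> Tuple[float, float]:
    """Return (burst tok/s, sustained tok/s) for one generation run.

    token_times holds seconds since the run started, one entry per token.
    """
    if not token_times:
        raise ValueError("no tokens recorded")
    last = max(token_times)
    if last <= warmup_s:
        raise ValueError("run is shorter than the warm-up period")
    burst = tokens_per_second(token_times, 0.0, burst_window_s)
    sustained = tokens_per_second(token_times, warmup_s, last)
    return burst, sustained
```

If the sustained figure is well below the burst figure on your unit, that gap is the thermal or power-limit effect this note describes.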
What it can run
52 models

| Model | Quantization | Speed | Rating |
| --- | --- | --- | --- |
| all-MiniLM-L6-v2 | FP16 | 5950 tok/s | Excellent |
| Arcee Trinity Nano 6B | Q8_0 | 45 tok/s | Excellent |
| Code Llama 34B Instruct | Q2_K | n/a | Not viable |
| Codestral 22B | Q3_K_M | n/a | Not viable |
| Command R 35B | Q2_K | n/a | Not viable |
| DeepSeek Coder V2 Lite 16B | Q3_K_M | 31 tok/s | Good |
| DeepSeek R1 Distill Qwen 32B | Q2_K | n/a | Not viable |
| DeepSeek R1 Distill Qwen 7B | Q4_K_M | 22 tok/s | Acceptable |
| DeepSeek V3 | Q2_K | n/a | Not viable |
| FLUX.1 Dev | Q4_K_M | n/a | Marginal |
| Gemma 2 27B Instruct | Q3_K_M | n/a | Not viable |
| Gemma 2 9B Instruct | Q4_K_M | 20 tok/s | Acceptable |
| Gemma 3 12B | Q3_K_M | 13 tok/s | Acceptable |
| Gemma 3 27B | Q3_K_M | n/a | Not viable |
| Gemma 3 4B | Q5_K_M | 39 tok/s | Good |
| Gemma 4 E2B | Q8_0 | 38 tok/s | Good |
| Gemma 4 E4B | Q6_K | 30 tok/s | Good |
| GigaChat Lightning 10B | Q4_K_M | 56 tok/s | Acceptable |
| InternLM 2.5 7B Chat | Q4_K_M | 21 tok/s | Acceptable |
| Llama 3.1 70B Instruct | Q2_K | n/a | Not viable |
| Llama 3.1 8B Instruct | Q4_K_M | 22 tok/s | Acceptable |
| Llama 3.2 1B Instruct | Q8_0 | 67 tok/s | Excellent |
| Llama 3.2 3B Instruct | Q8_0 | 46 tok/s | Good |
| Llama 3.3 70B Instruct | Q2_K | n/a | Not viable |
| LLaVA 1.6 13B | Q3_K_M | 15 tok/s | Acceptable |
| Mistral 7B Instruct v0.3 | Q4_K_M | 22 tok/s | Acceptable |
| Mistral Small 24B Instruct | Q3_K_M | n/a | Not viable |
| Mixtral 8x7B Instruct | Q4_K_M | n/a | Not viable |
| nomic-embed-text v1.5 | Q8_0 | 2940 tok/s | Excellent |
| NVIDIA Nemotron-3-super-120B-A12B | Q2_K | n/a | Not viable |
| Phi-3 Medium 14B Instruct | Q3_K_M | 14 tok/s | Acceptable |
| Phi-3 Mini 3.8B Instruct | Q5_K_M | 36 tok/s | Good |
| Phi-4 14B | Q3_K_M | 13 tok/s | Acceptable |
| Phi-4 Mini | Q5_K_M | 39 tok/s | Good |
| Qwen 2.5 14B Instruct | Q3_K_M | 12 tok/s | Acceptable |
| Qwen 2.5 72B Instruct | Q2_K | n/a | Not viable |
| Qwen 2.5 7B Instruct | Q4_K_M | 21 tok/s | Acceptable |
| Qwen 2.5 Coder 32B Instruct | Q2_K | n/a | Not viable |
| Qwen 2.5 Coder 7B Instruct | Q4_K_M | 22 tok/s | Acceptable |
| Qwen3-14B Instruct | Q3_K_M | 13 tok/s | Acceptable |
| Qwen3-8B Instruct | Q5_K_M | 16 tok/s | Acceptable |
| Qwen3.5-27B | Q3_K_M | n/a | Not viable |
| Qwen3.5-397B (MoE) | Q2_K | n/a | Not viable |
| Qwen3.6-27B | Q3_K_M | n/a | Not viable |
| QwQ 32B Preview | Q2_K | n/a | Not viable |
| Stable Diffusion 3 Medium | FP16 | n/a | Good |
| Stable Diffusion 3.5 Large | Q8_0 | n/a | Not viable |
| Stable Diffusion XL 1.0 | FP16 | n/a | Good |
| StarCoder 2 15B | Q3_K_M | 11 tok/s | Acceptable |
| Whisper Large V3 | Q5_K_M | n/a | Excellent |
| Whisper Large V3 Turbo | FP16 | n/a | Excellent |
| Yi 1.5 34B Chat | Q2_K | n/a | Not viable |
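A rough way to reason about which models fit in 8 GB is to estimate the size of the quantized weights as parameters times bits per weight. The Python sketch below does that arithmetic. The bits-per-weight values are approximate averages for these quantization types and the 1 GB overhead allowance is a placeholder, so both are assumptions. This is not the method used to produce the ratings on this page and it may disagree with individual rows.

```python
# Approximate bits per weight for common quantization types.
# ASSUMPTION: rough averages, not taken from this page; real files vary by model.
APPROX_BITS_PER_WEIGHT = {
    "Q2_K": 3.0,
    "Q3_K_M": 3.9,
    "Q4_K_M": 4.8,
    "Q5_K_M": 5.7,
    "Q6_K": 6.6,
    "Q8_0": 8.5,
    "FP16": 16.0,
}


def estimate_weight_gb(params_billion: float, quant: str) -> float:
    """Estimated size of the quantized weights in GB (10^9 bytes)."""
    bits = APPROX_BITS_PER_WEIGHT[quant]
    # (params_billion * 1e9 weights) * (bits / 8 bytes per weight) / 1e9 bytes per GB
    return params_billion * bits / 8.0


def fits_in_vram(
    params_billion: float,
    quant: str,
    vram_gb: float = 8.0,      # GPU memory listed for this laptop
    overhead_gb: float = 1.0,  # ASSUMPTION: placeholder for KV cache and buffers
) -> bool:
    """True if the estimated weights plus overhead fit in GPU memory."""
    return estimate_weight_gb(params_billion, quant) + overhead_gb <= vram_gb
```

For example, `estimate_weight_gb(8, "Q4_K_M")` gives about 4.8 GB, which leaves headroom in 8 GB, while `estimate_weight_gb(34, "Q2_K")` gives about 12.75 GB, which does not fit even at the lowest quantization listed.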
Best Fit
Who this machine makes sense for
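This machine is best for people who need portable local AI and are willing to trade some sustained throughput for mobility. Its 8 GB of GPU memory runs 7B to 9B models at Q4_K_M at roughly 20 to 22 tok/s and 12B to 15B models at Q3_K_M at 11 to 15 tok/s, while every listed language model of 22B parameters or more is rated not viable.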
Before You Buy
What to verify first
Pay attention to sustained power limits, fan noise, and battery-mode behavior. Laptop AI performance is often bounded by cooling and power policy more than spec-sheet peak numbers.
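One way to check sustained power behavior on a unit is to sample GPU telemetry while a long generation runs. The sketch below polls `nvidia-smi` through its `--query-gpu` CSV interface and parses one line of output. The helper names are hypothetical, and it assumes `nvidia-smi` is installed and on the PATH. Some laptop GPUs report fields such as the power limit as unavailable, so unparseable values are returned as `None`.

```python
import subprocess
from typing import Dict, Optional

# Fields requested from nvidia-smi, in output order.
FIELDS = ("power.draw", "power.limit", "clocks.sm", "temperature.gpu")


def parse_sample(csv_line: str) -> Dict[str, Optional[float]]:
    """Parse one 'csv,noheader,nounits' line into a field -> value mapping."""
    values = [v.strip() for v in csv_line.split(",")]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    sample: Dict[str, Optional[float]] = {}
    for name, raw in zip(FIELDS, values):
        try:
            sample[name] = float(raw)
        except ValueError:
            # Non-numeric output such as "[N/A]" means the value is unavailable.
            sample[name] = None
    return sample


def read_sample() -> Dict[str, Optional[float]]:
    """Query the first GPU once. Requires nvidia-smi on the PATH."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=" + ",".join(FIELDS),
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_sample(result.stdout.splitlines()[0])
```

Calling `read_sample()` every few seconds during a multi-minute run, once on AC power and once on battery, shows whether power draw and SM clock hold steady or fall off as the chassis heats up.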