Image generation has a more complicated VRAM story than language models. With a language model, the main question is whether the weights fit; an image generation pipeline adds a denoising loop, a VAE, and a text encoder, all of which compete for the same VRAM. The VRAM math compounds.
And the model landscape shifted significantly in 2025 and 2026. FLUX.1 Dev changed the benchmark for local image quality, but it comes with steep VRAM requirements. SD3 and SD3.5 occupy the middle ground. SDXL is still surprisingly capable on budget hardware. Here's how each model maps to actual consumer GPUs.
The image gen model landscape in 2026
Four models are worth benchmarking against your hardware right now. Here's the baseline VRAM picture:
| Model | FP16 VRAM | Q8_0 VRAM | Q4_K_M VRAM | Best for |
|---|---|---|---|---|
| Stable Diffusion XL 1.0 | 6.5 GB | N/A | N/A | Fast, versatile 1024px generation |
| Stable Diffusion 3 Medium | 5 GB | N/A | N/A | High coherence, great text rendering |
| Stable Diffusion 3.5 Large | 12.5 GB | 9 GB | N/A | Best SD3 quality; photorealism |
| FLUX.1 Dev | 23.8 GB | 13 GB | 7.2 GB | Top-tier quality; detail and realism |
SDXL and SD3 Medium are the accessible ones. Both run at FP16 on 8 GB cards. FLUX.1 Dev is where the VRAM conversation gets serious.
SDXL and SD3 Medium: the 8 GB tier
If you own an RTX 4060, RTX 3070, or any card with 8 to 12 GB of VRAM, SDXL and Stable Diffusion 3 Medium are your primary image generation tools. Both run at full FP16 quality within 8 GB.
SDXL is fast, well-supported, and has the richest ecosystem of LoRAs and fine-tunes. SD3 Medium has better text rendering and prompt coherence. For general photorealistic images, SD3 Medium edges ahead. For artistic styles and specialized fine-tunes, SDXL's ecosystem wins.
Stable Diffusion 3 Medium needs 5 GB of VRAM at FP16, so it runs on any GPU with 6 GB or more, including budget options.
Neither model pushes consumer GPUs. A 12 GB card like the RTX 4070 Ti runs them with roughly 5.5 to 7 GB to spare (12 GB minus 6.5 GB for SDXL, or minus 5 GB for SD3 Medium), which means you can generate at higher resolutions or run larger batch sizes without hitting the VRAM wall.
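The headroom figure is plain subtraction against the FP16 numbers in the model table. Here is a minimal sketch of that arithmetic; the helper name `headroom_gb` is just for illustration, and the calculation ignores whatever the OS, the display, and activations take at runtime.

```python
# FP16 VRAM figures (GB) copied from the model table in this article.
FP16_VRAM_GB = {
    "SDXL 1.0": 6.5,
    "SD3 Medium": 5.0,
}


def headroom_gb(card_vram_gb: float, model_vram_gb: float) -> float:
    """VRAM left after loading the model; a negative value means it does not fit."""
    return card_vram_gb - model_vram_gb


# Example: a 12 GB card such as the RTX 4070 Ti.
for name, need in FP16_VRAM_GB.items():
    spare = headroom_gb(12.0, need)
    print(f"{name}: {spare:.1f} GB spare on a 12 GB card")
```

Treat the result as an upper bound: higher resolutions and bigger batches eat into that spare memory quickly.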
FLUX.1 Dev: the demanding one
FLUX.1 Dev is the most capable open image generation model available in 2026, and it demands hardware to match. At FP16, it needs 23.8 GB. That's right at the edge of a 24 GB card.
The quantization options change the math significantly:
| Quantization | VRAM needed | Fits in | Quality impact |
|---|---|---|---|
| FP16 | 23.8 GB | 24 GB-class discrete GPUs and up | Full tier in our model data |
| Q8_0 | 13 GB | 16 GB cards on paper; verify exact runtime support per device | Recommended tier in our model data |
| Q4_K_M | 7.2 GB | RTX 4060 8 GB, any 8 GB card | Efficient tier in our model data |
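The figures in this table can be sanity-checked with a back-of-the-envelope estimate: parameter count times bits per weight. The sketch below rests on assumptions that are not from this article: a parameter count of about 12 billion for FLUX.1 Dev, and approximate bits-per-weight values for the GGUF-style quantization formats (8.5 for Q8_0, roughly 4.85 for Q4_K_M). It covers weights only, not the text encoder, VAE, or activations.

```python
# Assumption: FLUX.1 Dev has roughly 12 billion parameters.
PARAMS = 12e9

# Assumption: approximate bits per weight for each format.
# Q8_0 stores 32 int8 weights plus one FP16 scale per block (8.5 bits/weight);
# the Q4_K_M figure is a commonly quoted average and varies by model.
BITS_PER_WEIGHT = {"FP16": 16.0, "Q8_0": 8.5, "Q4_K_M": 4.85}


def weights_gb(params: float, bits_per_weight: float) -> float:
    """Estimated weight footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


for name, bpw in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{weights_gb(PARAMS, bpw):.1f} GB of weights")
```

Under those assumptions the estimates land close to the 23.8, 13, and 7.2 GB entries above, which suggests the table is dominated by the size of the weights.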
Per-GPU verdict
Every major consumer GPU in our database, with honest verdicts on what you can actually run:
| GPU | VRAM | SDXL | SD3 Medium | FLUX.1 Dev |
|---|---|---|---|---|
| RTX 4060 8GB | 8 GB | Full quality | Full quality | Q4 only (degraded) |
| RTX 4060 Ti 16GB | 16 GB | Full quality | Full quality | Q4_K_M path in matrix |
| RTX 4070 Ti 12GB | 12 GB | Full quality | Full quality | Q4_K_M (Q8_0 not viable in matrix) |
| RTX 4090 | 24 GB | Full quality | Full quality | FP16 in matrix |
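The FLUX.1 Dev column can be approximated by picking the highest-precision tier whose VRAM requirement fits the card. The following is a sketch of that paper check, not the logic behind the matrix itself; `best_flux_tier` and its `reserve_gb` parameter are hypothetical names introduced here for illustration.

```python
from typing import Optional

# FLUX.1 Dev VRAM needs (GB) from the quantization table, highest precision first.
FLUX_TIERS = [("FP16", 23.8), ("Q8_0", 13.0), ("Q4_K_M", 7.2)]


def best_flux_tier(card_vram_gb: float, reserve_gb: float = 0.0) -> Optional[str]:
    """Return the highest-precision tier that fits, or None if nothing fits.

    reserve_gb is a hypothetical safety margin for the OS, display, and
    runtime overhead; the default of 0 is a pure on-paper check.
    """
    usable = card_vram_gb - reserve_gb
    for name, need in FLUX_TIERS:
        if need <= usable:
            return name
    return None


cards = [("RTX 4060 8GB", 8), ("RTX 4070 Ti 12GB", 12),
         ("RTX 4060 Ti 16GB", 16), ("RTX 4090", 24)]
for card, vram in cards:
    print(f"{card}: {best_flux_tier(vram)}")
```

With no reserve, this check picks Q8_0 for a 16 GB card, while the matrix lists the Q4_K_M path. That gap is in line with the earlier caveat that Q8_0 fits 16 GB cards on paper but needs runtime verification per device.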
Recommendations by budget
I'll be direct about what I'd buy at each price point.
Under $400: The RTX 4060 8 GB (about $289 in our device data) handles SDXL and SD3 Medium at full quality. For FLUX, you're stuck at Q4, which is workable but not ideal. If image generation is your main use, save up.
$400 to $500: The RTX 4060 Ti 16 GB is the right buy for SDXL, SD3 Medium, and SD3.5 Large. This is the recommended card for most image generation setups if FLUX FP16 is not the main goal.
$1,500 and up: An RTX 4090 at about $1,799 in our device data is the cleanest consumer path to FLUX.1 Dev at FP16. It gives you 24 GB of VRAM, full FLUX quality, and enough headroom to make high-res batch generation practical.
