Mistral · Mistral AI Non-Production License
Mistral's dedicated coding model. Strong at code completion and generation across 80+ languages. Fits on a single 16GB GPU at Q3/Q4. Non-production license limits commercial use.
Codestral 22B (22.2B parameters) requires 15.1 GB VRAM at recommended quality (Q5_K_M). At efficient quality (Q4_K_M), it fits in 12.7 GB VRAM, making it compatible with the NVIDIA GeForce RTX 4060 Ti 16GB. On an NVIDIA GeForce RTX 4090, expect approximately 35 tok/s at Q5_K_M. For the best experience, the Budget Home AI Server build ($1,162) is recommended.
— OwnRig methodology, data updated 2026-03-01
| Quality | Quantization | VRAM | File Size |
|---|---|---|---|
| recommended | Q5_K_M | 15.1 GB | 13.3 GB |
| efficient | Q4_K_M | 12.7 GB | 11.1 GB |
| compressed | Q3_K_M | 10.3 GB | 8.7 GB |
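As a sanity check on the table above, the effective bits per weight of a quantized file can be estimated from its size and the parameter count. A minimal sketch, assuming the file sizes are in GiB and taking the 22.2B parameter count from the model card:

```python
def bits_per_weight(file_size_gib: float, n_params: float) -> float:
    """Estimate effective bits per weight of a quantized model file."""
    file_bytes = file_size_gib * 2**30   # GiB -> bytes
    return file_bytes * 8 / n_params     # total bits / parameter count

# Q5_K_M row from the table: 13.3 GiB file, 22.2B parameters
print(round(bits_per_weight(13.3, 22.2e9), 2))
```

K-quants are mixed-precision formats, so the effective figure (about 5.1 bits/weight here) sits below the nominal 5-bit-plus-scales label.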
KV cache VRAM at Q5_K_M quality. Longer context windows require more memory.
| Context | KV Cache | Total VRAM |
|---|---|---|
| 2K | 307 MB | 15.4 GB |
| 4K | 512 MB | 15.6 GB |
| 8K | 1 GB | 16.1 GB |
| 16K | 2 GB | 17.1 GB |
| 32K | 4.1 GB | 19.2 GB |
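The standard KV-cache size formula explains the scaling in this table: the cache grows linearly with context length. A minimal sketch, where the layer and head counts are assumptions taken from Codestral's published config; the result will not exactly match the vendor figures above, which may assume a quantized cache or include allocator overhead:

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context: int, bytes_per_elem: int = 2) -> int:
    """Standard KV cache: one K and one V vector per layer per token."""
    return 2 * n_layers * n_kv_heads * head_dim * context * bytes_per_elem

# Assumed Codestral 22B shape: 56 layers, 8 KV heads (GQA), head dim 128,
# fp16 cache elements (2 bytes each).
gib = kv_cache_bytes(56, 8, 128, 8192) / 2**30
print(f"{gib:.2f} GiB at 8K context")
```

Doubling the context doubles the cache, which is why 32K context pushes total VRAM past a 16 GB card even at Q5_K_M.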
Performance data for Codestral 22B across different hardware.
| Device | Quantization | Speed | Rating | Fits in VRAM |
|---|---|---|---|---|
| NVIDIA GeForce RTX 4060 Ti 16GB | Q3_K_M | 18 tok/s | Acceptable | ✓ |
| NVIDIA GeForce RTX 4090 | Q5_K_M | 35 tok/s | Excellent | ✓ |
| NVIDIA GeForce RTX 4060 8GB | Q3_K_M | — | Not Viable | ✗ (offload) |
| NVIDIA GeForce RTX 4070 Ti 12GB | Q3_K_M | 12 tok/s | Marginal | ✓ |
| NVIDIA GeForce RTX 3080 10GB | Q3_K_M | — | Not Viable | ✗ (offload) |
| Apple M3 Pro (18GB Unified) | Q3_K_M | — | Not Viable | ✗ (offload) |
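Decode speed on these devices is largely memory-bandwidth-bound: each generated token streams roughly the full weight file once, so bandwidth divided by model size gives a hard ceiling on tok/s. A rough sketch, using the RTX 4090's published ~1008 GB/s bandwidth (real throughput, like the 35 tok/s above, lands well below the ceiling due to compute and framework overhead):

```python
def decode_tok_s_upper_bound(mem_bw_gb_s: float, model_file_gib: float) -> float:
    """Bandwidth-bound ceiling: every token reads the full weights once."""
    model_gb = model_file_gib * 2**30 / 1e9   # GiB -> GB to match bandwidth units
    return mem_bw_gb_s / model_gb

# RTX 4090 (~1008 GB/s) with the Q5_K_M file (13.3 GiB)
print(round(decode_tok_s_upper_bound(1008, 13.3)))
```

The same arithmetic shows why "✗ (offload)" rows are not viable: layers spilled to system RAM are limited by PCIe or DDR bandwidth, an order of magnitude slower.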
Codestral 22B is commonly used with Cursor, Continue, and Windsurf. For an AI coding workflow, pair it with an embedding model such as nomic-embed-text for local RAG.
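The retrieval half of that pairing reduces to ranking code chunks by embedding similarity and prepending the best matches to the Codestral prompt. A toy sketch with hand-written stand-in vectors (in a real setup each chunk's `emb` would come from nomic-embed-text via a local server):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_emb: list[float], chunks: list[dict], k: int = 2) -> list[str]:
    """Return the texts of the k chunks most similar to the query."""
    ranked = sorted(chunks, key=lambda c: cosine(query_emb, c["emb"]), reverse=True)
    return [c["text"] for c in ranked[:k]]

# Toy 3-dim embeddings stand in for real nomic-embed-text output.
chunks = [
    {"text": "def parse_config(path): ...", "emb": [0.9, 0.1, 0.0]},
    {"text": "class HttpClient: ...",       "emb": [0.1, 0.9, 0.0]},
    {"text": "def load_yaml(path): ...",    "emb": [0.8, 0.2, 0.1]},
]
print(top_k([1.0, 0.0, 0.0], chunks, k=2))
```

Editor integrations like Continue automate exactly this loop: index the repo with the embedding model, retrieve on each request, and hand the context to the coding model.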
Complete PC builds that can run Codestral 22B.
Data confidence: verified. Last updated: 2026-03-01.