Mistral · Apache 2.0
Fast and capable 7B model. The original v0.1 release introduced sliding window attention; v0.2 and later use full attention over a 32K context. Good all-rounder, slightly behind Llama 3.1 8B on most benchmarks but fully open-source under Apache 2.0.
Mistral 7B Instruct v0.3 (7.24B parameters) requires 5.3 GB of VRAM at recommended quality (Q5_K_M). At efficient quality (Q4_K_M), it fits in 4.5 GB of VRAM, well within the 12 GB of the NVIDIA GeForce RTX 3060 12GB. On an NVIDIA GeForce RTX 4090, expect approximately 90 tok/s at Q8_0. For the best experience, the Starter AI Desktop ($582) is recommended.
— OwnRig methodology, data updated 2026-03-01
| Quality | Quantization | VRAM | File Size |
|---|---|---|---|
| full | Q8_0 | 8.1 GB | 7.2 GB |
| recommended | Q5_K_M | 5.3 GB | 4.3 GB |
| efficient | Q4_K_M | 4.5 GB | 3.6 GB |
| compressed | Q3_K_M | 3.6 GB | 2.8 GB |
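As a rough illustration of how the table above can be used, the following sketch picks the highest-quality quantization whose VRAM figure fits a given card. The VRAM numbers are copied from the table; the function name `best_quant` and the `headroom_gb` parameter are hypothetical, introduced here only for the example, and the check ignores KV cache and any other VRAM use.

```python
# VRAM figures (GB) copied from the quantization table, highest quality first.
QUANTS = [
    ("full", "Q8_0", 8.1),
    ("recommended", "Q5_K_M", 5.3),
    ("efficient", "Q4_K_M", 4.5),
    ("compressed", "Q3_K_M", 3.6),
]


def best_quant(vram_gb, headroom_gb=0.0):
    """Return the highest-quality (tier, quant, vram_gb) row that fits, or None.

    headroom_gb is an assumed safety margin reserved for context and other
    processes; it is not part of the source data.
    """
    budget = vram_gb - headroom_gb
    for tier, quant, need_gb in QUANTS:
        if need_gb <= budget:
            return tier, quant, need_gb
    return None


if __name__ == "__main__":
    for card_gb in (12.0, 8.0, 4.0):
        print(card_gb, "GB ->", best_quant(card_gb))
```

Under these figures, a 12 GB card has room for Q8_0, while an 8 GB card falls just short of the 8.1 GB that Q8_0 needs and lands on Q5_K_M.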
KV cache VRAM at Q5_K_M quality. Total VRAM is the 5.3 GB needed for the model weights plus the KV cache, so longer contexts need more memory.
| Context | KV Cache | Total VRAM |
|---|---|---|
| 2K | 102 MB | 5.4 GB |
| 4K | 205 MB | 5.5 GB |
| 8K | 410 MB | 5.7 GB |
| 16K | 922 MB | 6.2 GB |
| 32K | 1.8 GB | 7.1 GB |
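The totals in the table above are consistent with adding the KV cache to the 5.3 GB Q5_K_M baseline. A minimal sketch of that arithmetic follows, assuming 1 GB = 1000 MB, "K" = 1024 tokens, and the 1.8 GB entry taken as 1800 MB. The names `total_vram_gb` and `max_context` are hypothetical helpers for this example, not part of any tool.

```python
# Q5_K_M baseline and KV cache sizes copied from the tables above.
BASE_Q5_K_M_GB = 5.3
KV_CACHE_MB = {
    2048: 102,
    4096: 205,
    8192: 410,
    16384: 922,
    32768: 1800,  # listed as 1.8 GB; assumed to mean 1800 MB
}


def total_vram_gb(context_tokens):
    """Baseline VRAM plus KV cache for a listed context size, rounded to 0.1 GB."""
    kv_gb = KV_CACHE_MB[context_tokens] / 1000  # assumes 1 GB = 1000 MB
    return round(BASE_Q5_K_M_GB + kv_gb, 1)


def max_context(vram_gb):
    """Largest listed context whose total fits in vram_gb, or None if none fit."""
    fitting = [ctx for ctx in sorted(KV_CACHE_MB) if total_vram_gb(ctx) <= vram_gb]
    return fitting[-1] if fitting else None


if __name__ == "__main__":
    for ctx in sorted(KV_CACHE_MB):
        print(f"{ctx:>6} tokens: {total_vram_gb(ctx)} GB")
    print("Largest context in 6 GB:", max_context(6.0))
```

The sketch only looks up the context sizes listed in the table and does not interpolate between them. It also ignores VRAM used by the operating system or other applications.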
Performance data for Mistral 7B Instruct v0.3 across different hardware.
| Device | Quantization | Speed | Rating | Fits in VRAM |
|---|---|---|---|---|
| NVIDIA GeForce RTX 3060 12GB | Q5_K_M | 33 tok/s | Good | ✓ |
| NVIDIA GeForce RTX 4090 | Q8_0 | 90 tok/s | Excellent | ✓ |
| NVIDIA GeForce RTX 4070 Super | Q5_K_M | 50 tok/s | Excellent | ✓ |
| NVIDIA GeForce RTX 4080 Super | Q8_0 | 78 tok/s | Excellent | ✓ |
| NVIDIA GeForce RTX 4060 8GB | Q4_K_M | 31 tok/s | Good | ✓ |
| NVIDIA GeForce RTX 4070 Ti 12GB | Q5_K_M | 50 tok/s | Excellent | ✓ |
| NVIDIA GeForce RTX 3080 10GB | Q5_K_M | 48 tok/s | Excellent | ✓ |
| Apple M3 Pro (18GB Unified) | Q4_K_M | 14 tok/s | Acceptable | ✓ |
Complete PC builds that can run Mistral 7B Instruct v0.3.

NVIDIA GeForce RTX 3060 12GB · 32GB DDR4-3200 (2x16GB)

NVIDIA GeForce RTX 4060 Ti 16GB · 32GB DDR5-5200 (2x16GB)

NVIDIA GeForce RTX 4070 Super 12GB · 32GB DDR5-5600 (2x16GB)

NVIDIA GeForce RTX 3090 24GB (Used) · 64GB DDR5-5600 (2x32GB)

NVIDIA GeForce RTX 4060 Ti 16GB · 32GB DDR5-5600 (2x16GB)

NVIDIA GeForce RTX 3060 12GB · 16GB DDR4-3200 (2x8GB)
Data confidence: verified. Last updated: 2026-03-01.