Whisper · MIT
OpenAI's best open speech-to-text model. Supports 99 languages. Near-human accuracy for English. Low VRAM requirements: the quantized variants need under 2 GB, so it runs on most modern GPUs. Useful for builders who need voice-to-code or meeting transcription.
Whisper Large V3 (1.55B parameters) requires 1.5 GB VRAM at recommended quality (Q5_K_M). At efficient quality (Q4_K_M), it fits in 1.3 GB VRAM, making it compatible with the NVIDIA GeForce RTX 3060 12GB. For the best experience, the Starter AI Desktop ($582) is recommended.
— OwnRig methodology, data updated 2026-03-01
| Quality | Quantization | VRAM | File Size |
|---|---|---|---|
| full | FP16 | 3.1 GB | 3.1 GB |
| recommended | Q5_K_M | 1.5 GB | 0.93 GB |
| efficient | Q4_K_M | 1.3 GB | 0.78 GB |
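
The table above can be turned into a quick fit check. The sketch below picks the highest-quality variant that fits a given amount of free VRAM. The VRAM and file-size figures are copied from the table. The `best_variant` helper and its 0.5 GB default headroom (room for audio buffers and other processes) are assumptions for illustration and not part of the OwnRig methodology.

```python
from typing import NamedTuple, Optional


class Variant(NamedTuple):
    quality: str
    quantization: str
    vram_gb: float
    file_gb: float


# Figures copied from the table above, ordered from highest to lowest quality.
VARIANTS = [
    Variant("full", "FP16", 3.1, 3.1),
    Variant("recommended", "Q5_K_M", 1.5, 0.93),
    Variant("efficient", "Q4_K_M", 1.3, 0.78),
]


def best_variant(free_vram_gb: float, headroom_gb: float = 0.5) -> Optional[Variant]:
    """Return the highest-quality variant that fits, or None if none does.

    headroom_gb is an assumed safety margin, not a measured value.
    """
    for variant in VARIANTS:
        if variant.vram_gb + headroom_gb <= free_vram_gb:
            return variant
    return None


if __name__ == "__main__":
    for free in (12.0, 2.0, 1.0):
        choice = best_variant(free)
        label = choice.quantization if choice else "no variant fits"
        print(f"{free:>5.1f} GB free: {label}")
```

Setting `headroom_gb=0` reproduces the bare table thresholds. A larger margin is safer when the GPU also drives a display or hosts other models.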
Performance data for Whisper Large V3 across different hardware.
| Device | Quantization | Speed | Rating | Fits in VRAM |
|---|---|---|---|---|
| NVIDIA GeForce RTX 3060 12GB | FP16 | — | Excellent | ✓ |
| NVIDIA GeForce RTX 4090 | FP16 | — | Excellent | ✓ |
| NVIDIA GeForce RTX 4060 8GB | Q5_K_M | — | Excellent | ✓ |
| NVIDIA GeForce RTX 4070 Ti 12GB | FP16 | — | Excellent | ✓ |
| NVIDIA GeForce RTX 3080 10GB | Q5_K_M | — | Excellent | ✓ |
| Apple M3 Pro (18GB Unified) | Q5_K_M | — | Good | ✓ |
Complete PC builds that can run Whisper Large V3.

NVIDIA GeForce RTX 4090 · 64GB DDR5-5600 (2x32GB)

AMD Radeon RX 7900 XTX 24GB · 32GB DDR5-5600 (2x16GB)

NVIDIA GeForce RTX 3060 12GB · 32GB DDR4-3200 (2x16GB)

NVIDIA GeForce RTX 4060 Ti 16GB · 32GB DDR5-5200 (2x16GB)

2x NVIDIA GeForce RTX 3090 24GB (Used) + NVLink Bridge · 128GB DDR5-5600 (4x32GB)

NVIDIA GeForce RTX 3090 24GB (Used) · 64GB DDR5-5600 (2x32GB)

NVIDIA GeForce RTX 4060 Ti 16GB · 32GB DDR5-5600 (2x16GB)

NVIDIA GeForce RTX 3060 12GB · 16GB DDR4-3200 (2x8GB)
Data confidence: verified. Last updated: 2026-03-01.