Whisper Large V3 Turbo

Whisper · MIT

A pruned, fine-tuned version of Whisper Large V3, with the decoder cut from 32 layers to 4. Roughly 8x faster than the full model with minimal quality loss, which makes it the go-to for real-time transcription on local hardware. Runs comfortably on any GPU with 2 GB or more of VRAM.

Parameters: 810M
Architecture: Dense
Context: 1,500 tokens
Released: 2024-10-01
Engines: whisper.cpp, faster-whisper, transformers
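
As a rough illustration of one of the listed engines, here is a minimal faster-whisper sketch at FP16. It assumes the `faster-whisper` package is installed, a CUDA GPU is available, and the installed release recognizes the `large-v3-turbo` model name (older releases may need a local model path instead). The `transcribe` and `format_timestamp` helpers are hypothetical names for this example, not part of the library.

```python
def format_timestamp(seconds: float) -> str:
    """Render seconds as HH:MM:SS.mmm for readable segment output."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def transcribe(path: str, device: str = "cuda") -> list[str]:
    """Transcribe one audio file at FP16 and return timestamped lines."""
    # Imported lazily so format_timestamp works without the package installed.
    from faster_whisper import WhisperModel

    # Assumption: "large-v3-turbo" is a model name known to the installed
    # faster-whisper release; the weights are downloaded on first use.
    model = WhisperModel("large-v3-turbo", device=device, compute_type="float16")

    # transcribe() returns a lazy generator of segments plus an info object.
    segments, _info = model.transcribe(path)
    return [
        f"[{format_timestamp(s.start)} -> {format_timestamp(s.end)}] {s.text.strip()}"
        for s in segments
    ]
```

Calling `transcribe("meeting.wav")` (a placeholder file name) would return one line per segment. On a machine without a CUDA GPU, passing `device="cpu"` with a different `compute_type` is the usual fallback, at a considerable speed cost.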

Whisper Large V3 Turbo (810M) requires 1.6 GB of VRAM at the recommended quality (FP16). On the NVIDIA Grace Blackwell Ultra GB300, expect approximately 600 tok/s at FP16. For the best experience, OwnRig recommends the Starter AI Desktop ($582).

Source: OwnRig methodology
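
The 1.6 GB figure is consistent with simple weights-only arithmetic: 810M parameters at 2 bytes each (FP16) comes to about 1.62 GB. The sketch below shows that calculation. It is not OwnRig's actual formula, which this page does not give, and the `overhead_gb` default is a placeholder guess for activations, cache, and runtime context rather than a measured value.

```python
# Back-of-the-envelope VRAM estimate: model weights plus an assumed overhead.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0}


def weights_gb(params: float, precision: str = "fp16") -> float:
    """Size of the model weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * BYTES_PER_PARAM[precision] / 1e9


def fits(params: float, vram_gb: float, precision: str = "fp16",
         overhead_gb: float = 0.3) -> bool:
    """True if weights plus the assumed overhead fit within vram_gb.

    overhead_gb is an illustrative guess, not a benchmarked number.
    """
    return weights_gb(params, precision) + overhead_gb <= vram_gb


if __name__ == "__main__":
    params = 810e6  # parameter count listed on this page
    print(f"FP16 weights: {weights_gb(params):.2f} GB")  # FP16 weights: 1.62 GB
    print(f"Fits in 8 GB: {fits(params, 8.0)}")          # Fits in 8 GB: True
```

Real usage depends on the inference engine, batch size, and audio length, so treat the result as a lower bound to check against measurements on your own hardware.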

VRAM (Recommended): 1.6 GB
Quantization: FP16
File Size: 1.5 GB
Max Context: 1,500 tokens
Primary Use: Transcription
Compatible GPUs: 43 devices


GPU	Quantization	Throughput	Rating
NVIDIA Grace Blackwell Ultra GB300	FP16	600 tok/s	Excellent
Apple M4 Max (128GB Unified)	FP16	–	Excellent
Apple M4 Max (36GB Unified)	FP16	–	Excellent
Apple M4 Max (64GB Unified)	FP16	–	Excellent
Apple M4 Pro (24GB Unified)	FP16	–	Excellent
Apple M4 Pro (48GB)	FP16	–	Excellent
Apple M4 Ultra (192GB)	FP16	–	Excellent
NVIDIA GeForce RTX 3060 12GB	FP16	–	Excellent
NVIDIA GeForce RTX 3080 10GB	FP16	–	Excellent
NVIDIA GeForce RTX 3090	FP16	–	Excellent
NVIDIA GeForce RTX 4060 8GB	FP16	–	Excellent
NVIDIA RTX 4060 Laptop (40-60W)	FP16	–	Excellent
NVIDIA GeForce RTX 4060 Ti 16GB	FP16	–	Excellent
NVIDIA RTX 4070 Laptop (80-115W)	FP16	–	Excellent
NVIDIA GeForce RTX 4070 Super	FP16	–	Excellent
NVIDIA GeForce RTX 4070 Ti 12GB	FP16	–	Excellent
NVIDIA GeForce RTX 4070 Ti Super	FP16	–	Excellent
NVIDIA RTX 4080 Laptop (120-150W)	FP16	–	Excellent
NVIDIA GeForce RTX 4080 Super	FP16	–	Excellent
NVIDIA GeForce RTX 4090	FP16	–	Excellent
NVIDIA RTX 4090 Laptop (150-175W)	FP16	–	Excellent
NVIDIA GeForce RTX 5080	FP16	–	Excellent
NVIDIA GeForce RTX 5090	FP16	–	Excellent
AMD Radeon Pro W7900	FP16	–	Excellent
NVIDIA RTX PRO 6000 Blackwell	FP16	–	Excellent
NVIDIA RTX PRO 6000 Blackwell Max-Q	FP16	–	Excellent
NVIDIA GeForce RTX 5060 8GB	FP16	–	Excellent
NVIDIA GeForce RTX 5060 Ti 16GB	FP16	–	Excellent
Apple M3 Pro (18GB Unified)	FP16	–	Good
Apple M4 (16GB Unified)	FP16	–	Good
AMD Radeon RX 7600	FP16	–	Good
AMD Radeon RX 7900 XTX	FP16	–	Good
AMD Radeon RX 9070	FP16	–	Good
Apple M1 (8GB Unified)	FP16	–	Good
Apple M1 (16GB Unified)	FP16	–	Good
Apple M1 Pro (16GB Unified)	FP16	–	Good
Apple M2 (8GB Unified)	FP16	–	Good
Apple M2 (16GB Unified)	FP16	–	Good
Apple M2 Pro (16GB Unified)	FP16	–	Good
Apple M3 (8GB Unified)	FP16	–	Good
Apple M3 (16GB Unified)	FP16	–	Good
AMD Radeon RX 9060 XT 16GB	FP16	–	Good
AMD Radeon RX 9060 XT 8GB	FP16	–	Good

Frequently Asked Questions

How much VRAM does Whisper Large V3 Turbo need?
Whisper Large V3 Turbo requires 1.6 GB of VRAM at the recommended quality (FP16).

What is the best GPU for Whisper Large V3 Turbo?
The NVIDIA Grace Blackwell Ultra GB300 delivers the best performance for Whisper Large V3 Turbo, reaching 600 tok/s at FP16 with an Excellent rating.

Can I run Whisper Large V3 Turbo on an RTX 4060 Ti?
Yes. The NVIDIA GeForce RTX 4060 Ti 16GB runs Whisper Large V3 Turbo at FP16 with an Excellent rating. No throughput figure is listed for this GPU.

Data confidence: estimated.

VRAM requirements are calculated from model parameters and may vary by inference engine, context length, and batch size. Performance estimates are based on community benchmarks and should be verified for your specific configuration.

Whisper is a trademark of its respective owner. OwnRig is not affiliated with or endorsed by the model creator.