OwnRig

Model Fine-Tuning & Training


Fine-tune language models locally with QLoRA, LoRA, and full fine-tuning. Train custom adapters for domain-specific tasks without sending proprietary data to third-party APIs. VRAM requirements scale with model size and method: QLoRA fine-tuning a 7B model fits in 16GB, while full fine-tuning of 32B models needs 48GB+. System RAM matters — gradient checkpointing and dataset loading use 2-4x the model's VRAM in system memory.
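As a rough illustration of how those numbers scale, here is a back-of-the-envelope QLoRA VRAM estimator. The per-parameter byte count and overhead constants are simplified assumptions chosen to line up with the figures on this page, not exact values for any framework:

```python
def qlora_vram_gb(params_b: float, bytes_per_param: float = 0.55,
                  overhead_gb: float = 6.0) -> float:
    """Concurrent VRAM estimate: 4-bit quantized base weights plus a flat
    allowance for LoRA adapters, KV cache, and CUDA context. Both constants
    are rough assumptions, not measured values."""
    return params_b * bytes_per_param + overhead_gb

def peak_vram_gb(params_b: float, headroom: float = 1.6) -> float:
    """Peak VRAM during training steps (activation spikes before gradient
    checkpointing frees them). The 1.6x headroom factor is an assumption."""
    return qlora_vram_gb(params_b) * headroom

print(round(qlora_vram_gb(7)))  # a 7B target lands near the 10 GB concurrent figure
print(round(peak_vram_gb(7)))   # and near the 16 GB peak figure
```

The same shape of estimate explains the system-RAM note: multiply the concurrent figure by 2-4x for gradient checkpointing buffers and dataset loading.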

Axolotl · Unsloth · Hugging Face TRL · LLaMA Factory · PEFT
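Each of these toolchains expresses a run as a short config. A hypothetical Axolotl-style QLoRA config for one of the targets below might look like the following; the dataset path and hyperparameters are placeholders, so check your Axolotl version's docs for the exact keys it supports:

```yaml
base_model: meta-llama/Llama-3.1-8B-Instruct
load_in_4bit: true          # NF4 quantization keeps base weights at ~4 bits
adapter: qlora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true    # attach adapters to all linear layers

datasets:
  - path: ./data/my_domain.jsonl   # placeholder path to your dataset
    type: alpaca

sequence_len: 2048
micro_batch_size: 2
gradient_accumulation_steps: 8
gradient_checkpointing: true   # trades compute for the VRAM savings noted above
num_epochs: 3
learning_rate: 0.0002
bf16: true
```

Small `micro_batch_size` with gradient accumulation is the usual lever for staying inside the VRAM budget above.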

Concurrent VRAM: 10 GB
Peak VRAM: 16 GB
Min Bandwidth: 400 GB/s
Models Required: 3

VRAM Breakdown

How the 10 GB concurrent VRAM is used.

Switched (Loaded As Needed)

These share VRAM with the largest concurrent model — only one runs at a time.

Llama 3.1 8B Instruct (QLoRA fine-tuning target): 10 GB, Q4_K_M
Mistral 7B Instruct v0.3 (QLoRA fine-tuning target): 10 GB, Q4_K_M
Qwen 2.5 Coder 7B Instruct (code fine-tuning target): 10 GB, Q4_K_M
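The "only one runs at a time" behavior above can be sketched as a loader that evicts the resident model before loading the next, so all three targets share one 10 GB slot. The `load_fn`/`unload_fn` hooks are hypothetical stand-ins for a real runtime:

```python
class SwitchedModelSlot:
    """Minimal sketch of a shared VRAM slot: loading a new model first
    unloads the current one. load_fn/unload_fn are hypothetical hooks
    standing in for a real inference/training runtime."""

    def __init__(self, load_fn, unload_fn):
        self.load_fn = load_fn
        self.unload_fn = unload_fn
        self.current = None

    def use(self, model_name: str) -> str:
        if self.current == model_name:
            return self.current           # already resident, no reload
        if self.current is not None:
            self.unload_fn(self.current)  # free the shared VRAM slot first
        self.load_fn(model_name)
        self.current = model_name
        return self.current
```

This is why the concurrent figure is 10 GB rather than 30 GB: the peak comes from one training run, not from stacking models.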

Local vs API Costs

Typical Monthly API Cost: $200/mo
Break-Even Point: 10 months
Annual Savings After Break-Even: ~$1,920/yr

Based on OpenAI's fine-tuning API pricing ($8/1M training tokens, assuming 3-5 fine-tuning runs per month on 50K-row datasets). Local fine-tuning allows unlimited iterations at zero per-token cost. Electricity adds roughly $15/mo at 6 hours/day of GPU usage during training; hardware assumes the Mid-Range Workstation at ~$1,400. The privacy advantage is the real differentiator: proprietary data never leaves your machine.
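The break-even arithmetic can be sketched directly. The inputs below are this page's headline figures taken at face value; the page's rounder 10-month and ~$1,920/yr numbers appear to bake in more conservative usage assumptions:

```python
import math

def break_even_months(build_cost: float, api_cost_mo: float,
                      electricity_mo: float) -> int:
    """Months until local hardware pays for itself versus API fine-tuning.
    Assumes constant monthly usage; real workloads vary."""
    monthly_savings = api_cost_mo - electricity_mo
    return math.ceil(build_cost / monthly_savings)

def annual_savings(api_cost_mo: float, electricity_mo: float) -> float:
    """Savings per year once the build cost has been recovered."""
    return (api_cost_mo - electricity_mo) * 12
```

Plugging in the $1,400 build, $200/mo API spend, and $15/mo electricity gives roughly 8 months and $2,220/yr under these simplified assumptions.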

Recommended Builds

Pre-configured builds that can run the Model Fine-Tuning & Training workflow.

Prefer a Mac? Apple Silicon with unified memory can run this workflow too. See the Mac AI Builder workflow →

Get a personalized recommendation for this workflow →

Author: Ada. Last updated: 2026-03-14.