13 Fine-tuning


LoRA and QLoRA change behavior cheaply, but fine-tuning is the last lever, not the first. Try template, prompt, model, and RAG first.

Adapted from Ahmad Osman, "LLMs 101: A Practical Guide (2026)".

Fine-tuning changes model behavior by continuing to train a pretrained model on additional, task-specific data. For local users, the two methods that matter most are LoRA and QLoRA. LoRA freezes the base model and trains small low-rank adapter weights, which cuts the number of trainable parameters to a tiny fraction of the full model and lets you keep multiple lightweight adapters for one base. QLoRA extends this by backpropagating gradients through a frozen 4-bit quantized base model into the LoRA adapters, which is what makes fine-tuning large models feasible on a single consumer GPU.
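To make the savings concrete, here is a rough back-of-the-envelope sketch in Python. The layer size (4096 x 4096), the rank (16), and the 7-billion-parameter model are illustrative assumptions, not figures from the guide, and the memory estimate covers weights only: it ignores quantization constants, activations, gradients, and optimizer state.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> dict:
    """Compare trainable parameters for one linear layer: full fine-tune vs. LoRA."""
    full = d_in * d_out            # every entry of the weight matrix W (d_out x d_in)
    lora = rank * (d_in + d_out)   # adapter A is (rank x d_in), adapter B is (d_out x rank)
    return {"full": full, "lora": lora, "fraction": lora / full}


def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in GB (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Hypothetical 4096 x 4096 projection with rank-16 adapters (assumed sizes).
counts = lora_param_counts(d_in=4096, d_out=4096, rank=16)
print(counts["full"], counts["lora"], counts["fraction"])

# Hypothetical 7-billion-parameter base model: 16-bit vs. 4-bit weights.
print(weight_memory_gb(7e9, 16), weight_memory_gb(7e9, 4))
```

Under these assumptions the adapters train well under 1% of the layer's parameters, and the frozen base weights shrink to roughly a quarter of their 16-bit size, which is the combination QLoRA relies on.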

Fine-tune when you need a consistent writing style, a domain-specific output format, repetitive classification or extraction behavior, tool-call format reliability, a specialized persona, domain adaptation that retrieval cannot solve, or better small-model performance on a narrow task.

But do not fine-tune first. Try this order: correct chat template, better prompting, a better model, better decoding, RAG, reranking, few-shot examples, and only then fine-tuning. Most problems that look like “the model does not understand my domain” are actually a vague prompt, a wrong template, or broken retrieval. Fine-tuning is the most expensive lever and the slowest to iterate, so it should be the last one you reach for, not the first.

When you do fine-tune, a sound plan includes clean data, train/validation/test splits, baseline evals, a clear target behavior, a safety review, overfitting and regression checks, adapter versioning, license review, and a rollback plan. The tooling for pretraining and fine-tuning lives at the training layer.
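Two items in that plan, the data splits and the regression check, are easy to sketch in plain Python. This is a minimal illustration under stated assumptions, not a prescribed workflow: the function names, the 80/10/10 split, the metric names, and the "higher score is better" convention are all hypothetical.

```python
import hashlib


def assign_split(example_id: str, val_pct: int = 10, test_pct: int = 10) -> str:
    """Map an example ID to 'train', 'validation', or 'test' deterministically."""
    digest = hashlib.sha256(example_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100   # stable bucket in [0, 100)
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + val_pct:
        return "validation"
    return "train"


def regressed(baseline: dict, candidate: dict, tolerance: float = 0.01) -> list:
    """List metrics where the candidate scores lower than baseline by more than tolerance.

    Assumes higher is better; a metric missing from the candidate counts as a regression.
    """
    return sorted(
        name for name, score in baseline.items()
        if candidate.get(name, float("-inf")) < score - tolerance
    )


# Hypothetical example IDs and eval scores, for illustration only.
splits = {ex_id: assign_split(ex_id) for ex_id in ("ticket-001", "ticket-002", "ticket-003")}
failures = regressed(
    baseline={"format_accuracy": 0.80, "general_qa": 0.50},
    candidate={"format_accuracy": 0.70, "general_qa": 0.60},
)
print(splits, failures)
```

Hashing the ID rather than shuffling keeps each example in the same split when the dataset grows, so test examples do not leak into training between adapter versions. Running the same eval set against the base model before training gives the baseline that `regressed` compares against.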