Fine-tuning

Chapter 5: Fine-tuning

Learn how to adapt pretrained LLMs to specific tasks and domains. Master the spectrum of fine-tuning approaches from full parameter updates to parameter-efficient methods like LoRA and QLoRA, understand instruction tuning that transforms base models into helpful assistants, and learn best practices for data preparation and evaluation.

Pretraining gives an LLM broad language understanding, but a pretrained model is just a next-token predictor: it has not been trained to follow instructions, answer questions helpfully, or refuse harmful requests. Fine-tuning bridges this gap by adapting the model to specific behaviors or domains using curated datasets.

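There are two major paradigms of fine-tuning. Full fine-tuning updates all model parameters. It typically achieves the highest quality but requires significant compute: often 10-100x less than pretraining, yet still substantial for large models, because gradients and optimizer state must be kept for every weight. Parameter-efficient fine-tuning (PEFT) methods like LoRA freeze the pretrained weights and train only a small number of additional parameters. Combined with 4-bit quantization of the frozen weights (QLoRA), this makes it possible to fine-tune models in the 65B-70B range on a single high-memory GPU.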
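To make the "small fraction of parameters" claim concrete, here is a minimal arithmetic sketch for a single weight matrix. It assumes the standard LoRA formulation, in which a frozen weight W of shape (d_out, d_in) receives a trainable low-rank update B @ A with A of shape (rank, d_in) and B of shape (d_out, rank). The layer size 4096 and rank 8 are illustrative assumptions, not values taken from this chapter.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Parameters updated by full fine-tuning of one d_out x d_in weight matrix."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds for the same matrix.

    A has shape (rank, d_in) and B has shape (d_out, rank);
    the original weight stays frozen.
    """
    return rank * d_in + d_out * rank


# Illustrative (assumed) sizes: a square 4096 x 4096 projection, rank 8.
d_in = d_out = 4096
rank = 8

full = full_params(d_in, d_out)
lora = lora_trainable_params(d_in, d_out, rank)
fraction = lora / full

print(f"full: {full:,}  lora: {lora:,}  fraction: {fraction:.4%}")
```

Under these assumptions, LoRA trains about 0.4% as many parameters as full fine-tuning for that matrix. The share of the whole model depends on which layers receive adapters and on the chosen rank.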

The most impactful form of fine-tuning for modern LLMs is instruction tuning: supervised training on (instruction, response) pairs so that the model learns to follow user requests. This is what transforms a base model like Llama 2 into a chat model like Llama 2-Chat. Combined with RLHF (Chapter 6), instruction tuning produces the helpful, harmless assistants we interact with today.
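As a rough illustration of what one (instruction, response) training example can look like, the sketch below joins a pair into a single token sequence and marks which positions contribute to the loss. The prompt template, the whitespace tokenizer, the `<eos>` marker, and the choice to compute loss only on response tokens are all assumptions made for illustration. Real pipelines use the model's own tokenizer and chat template, and some also train on the prompt tokens.

```python
from typing import List, Tuple

# Hypothetical prompt template; real models define their own chat format.
PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"
EOS = "<eos>"  # placeholder end-of-sequence marker


def toy_tokenize(text: str) -> List[str]:
    """Stand-in tokenizer that splits on whitespace (not a real subword tokenizer)."""
    return text.split()


def build_example(instruction: str, response: str) -> Tuple[List[str], List[int]]:
    """Return (tokens, loss_mask) for one instruction-tuning example.

    loss_mask is 0 for prompt tokens and 1 for response tokens, so the
    next-token loss would only be computed where the model should answer.
    """
    prompt_tokens = toy_tokenize(PROMPT_TEMPLATE.format(instruction=instruction))
    response_tokens = toy_tokenize(response) + [EOS]
    tokens = prompt_tokens + response_tokens
    loss_mask = [0] * len(prompt_tokens) + [1] * len(response_tokens)
    return tokens, loss_mask


tokens, loss_mask = build_example("Translate to French: cheese", "fromage")
for token, mask in zip(tokens, loss_mask):
    print(f"{mask}  {token}")
```

The training objective is still next-token prediction, as in pretraining. What changes is the data (curated pairs in a fixed format) and, in this sketch, which positions are scored.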

This chapter covers:

Transfer Learning: The theoretical foundation for why fine-tuning works
Full Fine-tuning: When and how to update all parameters
LoRA & QLoRA: Parameter-efficient fine-tuning that democratized LLM adaptation
Instruction Tuning: Creating helpful assistants from base models
Data Preparation: Building high-quality fine-tuning datasets
