📖 Fine-tuning LLMs
🧠 Why Fine-tune a Language Model?
🔄 Prompting Limitations
🎯 Use Cases for Fine-tuning
🧠 Behavioral vs. Task-Specific Tuning
⚙️ Types of Fine-tuning
🧰 Full Fine-tuning
🧱 Adapter-based Tuning
🧪 LoRA (Low-Rank Adaptation)
🎛️ Prefix/Prompt Tuning
🛠️ Fine-tuning Pipeline Overview
📄 Data Collection and Formatting
🧹 Preprocessing and Tokenization
🔧 Training Setup and Config
📉 Evaluation and Checkpoints
📦 Tools and Frameworks
🤗 Hugging Face Transformers + Datasets
🧠 PEFT (Parameter-Efficient Fine-Tuning)
🧪 OpenLLM, Axolotl, LoRA Libraries
📊 Case Studies / Example Walkthroughs
📄 Fine-tuning for Text Classification
💬 Fine-tuning for Q&A
🤖 Fine-tuning for Chatbots
⚖️ Tradeoffs and Considerations
💰 Compute and Cost Constraints
🧠 Catastrophic Forgetting
🔄 Overfitting to Instruction Style
🧪 Evaluation Best Practices
🧠 Task-specific Metrics
🔍 Manual Review of Generations
📊 Comparing Baseline vs. Fine-tuned
🔚 Closing Notes
🧭 Summary and When to Fine-tune
🚀 Next Up: Hugging Face Workflows (07)
🧠 What to Try on Your Own
🧠 Why Fine-tune a Language Model?
🔄 Prompting Limitations
🎯 Use Cases for Fine-tuning
🧠 Behavioral vs. Task-Specific Tuning
⚙️ Types of Fine-tuning
🧰 Full Fine-tuning
🧱 Adapter-based Tuning
🧪 LoRA (Low-Rank Adaptation)
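LoRA freezes the pretrained weights and learns a low-rank update ΔW = B·A, with rank r much smaller than the layer's dimensions, so only A and B are trained. A minimal PyTorch sketch of the idea (illustrative only, not the PEFT implementation):

```python
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update: y = W x + (alpha / r) * B(A x)."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                 # freeze the pretrained weights
        self.lora_A = nn.Linear(base.in_features, r, bias=False)    # A: in_features -> r
        self.lora_B = nn.Linear(r, base.out_features, bias=False)   # B: r -> out_features
        nn.init.zeros_(self.lora_B.weight)          # zero init so the wrapped layer starts unchanged
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + self.scaling * self.lora_B(self.lora_A(x))

layer = LoRALinear(nn.Linear(768, 768), r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 2 * 768 * 8 = 12,288 trainable parameters vs. ~590k in the frozen layer
```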
🎛️ Prefix/Prompt Tuning
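Prompt tuning (and its prefix-tuning cousin) keeps every model weight frozen and instead learns a handful of "virtual token" embeddings that are prepended to each input. A small sketch using the PEFT library; the base checkpoint and token count are placeholder choices:

```python
from peft import PromptTuningConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder base model
config = PromptTuningConfig(
    task_type=TaskType.CAUSAL_LM,
    num_virtual_tokens=20,        # number of learned soft-prompt embeddings
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the virtual-token embeddings require gradients
```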
🛠️ Fine-tuning Pipeline Overview
📄 Data Collection and Formatting
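Instruction-style training data is commonly stored as JSON Lines, one example per line. The field names below are an illustrative convention, not a requirement of any library:

```python
import json

# Toy instruction-tuning records; swap in your own fields and examples.
examples = [
    {"instruction": "Classify the sentiment.", "input": "The battery life is great.", "output": "positive"},
    {"instruction": "Classify the sentiment.", "input": "It stopped working after two days.", "output": "negative"},
]

with open("train.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        f.write(json.dumps(ex, ensure_ascii=False) + "\n")
```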
🧹 Preprocessing and Tokenization
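With Hugging Face `datasets`, tokenization is usually applied once over the whole dataset with `Dataset.map`. A sketch using a public dataset; the checkpoint and sequence length are placeholders:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")  # placeholder checkpoint
dataset = load_dataset("imdb")                                        # public example dataset

def tokenize(batch):
    # Fixed-length padding keeps the example simple; dynamic padding with
    # DataCollatorWithPadding is the more efficient option in practice.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)
print(tokenized["train"][0].keys())  # original columns plus input_ids and attention_mask
```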
🔧 Training Setup and Config
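In a `Trainer`-based setup the run is configured through `TrainingArguments`. A minimal sketch; every hyperparameter below is a placeholder to tune for your model and hardware:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,   # effective per-device batch size of 32
    learning_rate=2e-5,
    warmup_ratio=0.03,
    logging_steps=50,
    save_strategy="epoch",           # write a checkpoint at the end of each epoch
)
```

These arguments are handed to `transformers.Trainer` together with the model, the tokenized datasets, and (optionally) a metrics callback.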
📉 Evaluation and Checkpoints
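`Trainer` calls a user-supplied `compute_metrics` function on the evaluation split; paired with an evaluation and save strategy in `TrainingArguments`, this reports metrics at every checkpoint. A sketch of an accuracy callback for a classification task:

```python
import numpy as np

def compute_metrics(eval_pred):
    # Trainer passes (logits, labels) for the evaluation split.
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": float((preds == labels).mean())}
```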
📦 Tools and Frameworks
🤗 Hugging Face Transformers + Datasets
🧠 PEFT (Parameter-Efficient Fine-Tuning)
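PEFT wraps a base model with an adapter config and marks only the adapter weights as trainable. A LoRA sketch; the base checkpoint, hyperparameters, and target module names are placeholders (target modules differ per architecture):

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")   # placeholder base model
config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                        # rank of the low-rank update
    lora_alpha=16,              # scaling factor
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused attention projection; other models use other names
    fan_in_fan_out=True,        # GPT-2 stores these weights in transposed (Conv1D) layout
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of the base model's parameters
```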
🧪 OpenLLM, Axolotl, LoRA Libraries
📊 Case Studies / Example Walkthroughs
📄 Fine-tuning for Text Classification
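Classification fine-tuning swaps the language-modeling head for a freshly initialized classification head. A sketch that reuses the tokenized IMDB dataset and the `compute_metrics` callback from the earlier snippets; hyperparameters are placeholders:

```python
from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

# num_labels attaches a new, randomly initialized classification head.
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="imdb-clf", num_train_epochs=1, per_device_train_batch_size=16),
    train_dataset=tokenized["train"],   # tokenized dataset from the preprocessing sketch
    eval_dataset=tokenized["test"],
    compute_metrics=compute_metrics,    # accuracy callback from the evaluation sketch
)
trainer.train()
```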
💬 Fine-tuning for Q&A
🤖 Fine-tuning for Chatbots
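Chat fine-tuning data is a list of role-tagged messages per conversation; tokenizers that ship a chat template can render those messages into the exact prompt format the model was trained on. The checkpoint below is a placeholder (it just needs a tokenizer that defines a chat template):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-beta")  # placeholder chat model

messages = [
    {"role": "system", "content": "You are a concise support assistant."},
    {"role": "user", "content": "How do I reset my password?"},
    {"role": "assistant", "content": "Open Settings, choose Security, then select Reset password."},
]

# Render the conversation into a single training string in the model's expected format.
text = tokenizer.apply_chat_template(messages, tokenize=False)
print(text)
```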
⚖️ Tradeoffs and Considerations
💰 Compute and Cost Constraints
🧠 Catastrophic Forgetting
🔄 Overfitting to Instruction Style
🧪 Evaluation Best Practices
🧠 Task-specific Metrics
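Pick metrics that match the task: accuracy or F1 for classification, overlap metrics such as ROUGE for summarization-style generation, exact match and F1 for extractive Q&A. A sketch with the `evaluate` library and toy strings (ROUGE additionally needs the `rouge_score` package installed):

```python
import evaluate

rouge = evaluate.load("rouge")
predictions = ["the model fine-tunes quickly"]          # toy outputs, for illustration only
references = ["the model can be fine-tuned quickly"]
print(rouge.compute(predictions=predictions, references=references))  # rouge1 / rouge2 / rougeL
```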
🔍 Manual Review of Generations
📊 Comparing Baseline vs. Fine-tuned
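The most direct comparison is to generate from the base checkpoint and the fine-tuned checkpoint on the same held-out prompts and inspect or score the pairs side by side. A sketch; the base model ID, output directory, and prompt are placeholders:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "gpt2"                    # placeholder base checkpoint
tuned_dir = "out/checkpoint-final"  # placeholder path to the fine-tuned weights
prompt = "Explain LoRA in one sentence."

tokenizer = AutoTokenizer.from_pretrained(base_id)
inputs = tokenizer(prompt, return_tensors="pt")

for name, path in [("baseline", base_id), ("fine-tuned", tuned_dir)]:
    model = AutoModelForCausalLM.from_pretrained(path)
    output = model.generate(**inputs, max_new_tokens=64, do_sample=False)  # greedy, for repeatability
    print(f"[{name}] {tokenizer.decode(output[0], skip_special_tokens=True)}")
```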
🔚 Closing Notes
🧭 Summary and When to Fine-tune
🚀 Next Up: Hugging Face Workflows (07)
🧠 What to Try on Your Own