Domain-Specific LLM (QLoRA)

2025

A niche philosophy community (Taking Children Seriously) had decades of written content but no way to access it conversationally, so I fine-tuned a general-purpose LLM into a domain expert.

I built a data pipeline around LLM-assisted cleaning that distilled 2,044 noisy articles down to 1,037 clean training examples.
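The cleaning stage can be sketched as a pre-filter followed by a rewrite prompt sent to the cleaning model. This is a minimal illustration, not the project's actual code: the marker strings, word-count threshold, and function names are all hypothetical, and the real pipeline used four data-engineering strategies rather than one filter.

```python
# Hypothetical sketch of an LLM-assisted cleaning pass: drop articles
# that are too short or dominated by site chrome, then build a rewrite
# prompt for the cleaning model (e.g. Claude Haiku).

BOILERPLATE_MARKERS = ("subscribe", "cookie policy", "all rights reserved")
MIN_WORDS = 120  # hypothetical cutoff for a usable article

def is_trainable(article: str) -> bool:
    """Keep articles long enough to carry content and free of site chrome."""
    text = article.lower()
    if len(text.split()) < MIN_WORDS:
        return False
    return not any(marker in text for marker in BOILERPLATE_MARKERS)

def build_rewrite_prompt(article: str) -> str:
    """Ask the cleaning model for a faithful rewrite that strips noise
    but preserves every argument and attribution."""
    return (
        "Rewrite the following article as clean prose. Preserve every "
        "argument and attribution; remove navigation text, signatures, "
        "and formatting artifacts.\n\n" + article
    )

def distill(articles: list[str]) -> list[str]:
    """Filter first, then produce one rewrite prompt per surviving article."""
    return [build_rewrite_prompt(a) for a in articles if is_trainable(a)]
```

Filtering before sending anything to the API keeps the rewrite bill small: only articles worth rewriting consume tokens.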

Final model: accurate attributions, correct philosophy, proper terminology.

Technical: Fine-tuned Llama 3.2 3B Instruct with QLoRA on a consumer RTX 3060 (12 GB). An 8-script pipeline covered web scraping, 4-strategy data engineering, Claude Haiku data distillation (~$5 for 1,700 rewrites), training via Unsloth (rank=16, ~50 min/run), a structured evaluation framework, and GGUF export for deployment.
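The rank=16 figure explains why this fits on a 12 GB card: a LoRA adapter on a projection layer adds only two small matrices, A (r × d_in) and B (d_out × r), while the 4-bit base weights stay frozen. The back-of-envelope count below uses Llama 3.2 3B dimensions assumed from public model configs (hidden size 3072, 28 layers, 1024-dim grouped-query K/V projections) and counts only the attention projections, which may differ from the project's actual target modules.

```python
# Back-of-envelope count of LoRA trainable parameters at rank 16.
# Model dimensions are assumptions from public Llama 3.2 3B configs,
# not values taken from this project.

RANK = 16
NUM_LAYERS = 28
HIDDEN = 3072
KV_DIM = 1024  # grouped-query attention: 8 KV heads x 128 head dim

# (in_features, out_features) for each adapted projection per layer
ATTN_PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}

def lora_params(d_in: int, d_out: int, r: int) -> int:
    """A LoRA adapter adds A (r x d_in) plus B (d_out x r) parameters."""
    return r * (d_in + d_out)

per_layer = sum(lora_params(i, o, RANK) for i, o in ATTN_PROJECTIONS.values())
total = per_layer * NUM_LAYERS
print(f"{total:,} trainable parameters")  # roughly 9.2M vs ~3B frozen
```

Under these assumptions the adapters are roughly 0.3% of the model, which is what makes ~50-minute runs on a single consumer GPU plausible.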

Stack

Python, Unsloth, HuggingFace, PEFT, QLoRA, Claude API, NVIDIA RTX 3060