Introduction to LLM
This page provides an easy-to-understand guide on LLMs (Large Language Models) from basics to applications for AI enthusiasts.
Total of 1 articles available. |
Currently on page 1 of 1.
Chapter 10 — Post-Training and Alignment Mathematics
Tenth post of the LLM Primer II walkthrough. The mathematics that civilizes a brilliant but feral next-word predictor into a helpful assistant — supervised fine-tuning, reward modeling, RLHF on a KL leash, and the elegant DPO derivation that collapses the whole pipeline into a single supervised loss.
2026-03-12