Kun Yuan

Lectures 2026: Optimization for Large Language Models

Instructor: Kun Yuan (kunyuan@pku.edu.cn)

This is a 10-hour intensive course on Optimization for Large Language Models. I would like to express my sincere gratitude to the Operations Research Society of China for the invitation and excellent organization.

Classroom: To be announced

Time: To be announced

References

Martin Jaggi and Nicolas Flammarion, Optimization for Machine Learning, EPFL Class CS-439
Chris De Sa, Advanced Machine Learning Systems, Cornell CS6787
Zaiwen Wen, Optimization Methods, PKU 2024 Fall
Kun Yuan, Introduction to LLM, PKU 2025 Spring

Materials

Lecture 1: Gradient Descent

Lecture 2: Stochastic Gradient Descent

Lecture 3: LLM Foundations - Part I

Lecture 4: LLM Foundations - Part II

Lecture 5: Costs in LLM Pre-Training

Lecture 6: Perturbed SGD and Mixed-Precision Training

Lecture 7: Coordinate Descent and Layer-wise Training

Lecture 8: Subspace Optimization and Low-Rank Training

Lecture 9: Zeroth-order Optimization and Activation-Free LLM Training

Lecture 10: Distributed Optimization for LLM Training