PKU Class 2025 Spring: Introduction to Foundation Models
Instructor: Kun Yuan (kunyuan@pku.edu.cn)
Teaching assistants:
- Jie Hu (hujie@stu.pku.edu.cn)
- Yipeng Hu (2301213082@stu.pku.edu.cn)
- Qiulin Shang (2100013145@stu.pku.edu.cn)
- Yilong Song (2301213059@pku.edu.cn)
Office hours: Wednesday, 4pm - 5pm, Room 220, Jingyuan Courtyard 6 (静园六院220)
References
Stanford CS224n: Natural Language Processing with Deep Learning
Lectures
Lecture 1: Introduction to LLMs
Lecture 2: Basics in machine learning
- Warm up: Preliminary [Notes]
- Linear regression; Logistic regression; Multi-class classification; Neural networks [Slides]
- Reading:
Lecture 3: Gradient descent
- Convex set; Convex functions; Convex problems; Gradient descent [Slides] [Notes]
- Forward-backward propagation [Notes]
Lecture 4: Stochastic gradient descent
- Stochastic gradient descent (SGD); mini-batch SGD [Slides] [Notes]
- Mini-batch forward-backward propagation [Slides]
- Reading:
Lecture 5: Advanced optimizers
- Momentum SGD; Nesterov SGD [Slides]
- Adaptive SGD; AdaGrad; RMSProp; Adam [Slides]
Lecture 6: Language models
- Word embedding; Language models; Recurrent neural networks [Slides]
- Reading:
- Seq2seq models; Cross-attention; Self-attention; Transformers [Slides]
- Reading: