PKU Class 2024 Spring: Large Language Model in Decision Intelligence
Instructor: Kun Yuan (kunyuan@pku.edu.cn)
Teaching assistants:
- Yudong Bai (yutonghe@pku.edu.cn)
- Yunteng Geng (2301213081@pku.edu.cn)
- Yutong He (yutonghe@pku.edu.cn)
- Peijin Li (2301213056@stu.pku.edu.cn)
- Zihao Liu (2100011704@stu.pku.edu.cn)
- Keer Lu (2301213094@stu.pku.edu.cn)
- Yilong Song (2301213059@pku.edu.cn)
- Qianyou Sun (2301111049@stu.pku.edu.cn)
- Yuchi Wang (wangyuchi@stu.pku.edu.cn)
Office hours: 4pm - 5pm Wednesday, Room 220, Jingyuan Courtyard 6 (静园六院)
References
Stanford CS224n: Natural Language Processing with Deep Learning
Lectures
Lecture 1: Introduction to LLM
- Introduction to large language model [Slides]
- Reading:
Lecture 2: Linear algebra and optimization
Lecture 3: Basics in machine learning
- Linear regression; Logistic regression; Multi-class classification; Neural network [Slides]
- Reading:
Lecture 4: Word embedding and language models
- Word embedding [Slides]
- Language models; Recurrent neural network (Slides are adapted from Stanford CS224n RNN)
- Backpropagation in RNN [Slides]
- Sequence-to-sequence model (Slides are adapted from Stanford CS224n Seq2Seq)
- Forward-backward propagation [Hand-written materials]
- Transformers (Slides are adapted from Stanford CS224n Transformers)
- Parameters and Computations in Transformers [Slides]
- Reading:
Guest Lecture I:
- Large language model in mathematical reasoning (Dr. Jihai Zhang, Alibaba DAMO Academy)
Lecture 6: Pretrain and Fine-tune Paradigm
- Teacher forcing; Pretrain; Fine-tune; BERT; GPTs [Slides]
- Reading:
Lecture 7: Optimizers
Midterm Exam
Lecture 8: Distributed Training
- Scaling law [Slides]
- Data parallelism and communication saving; Pipeline parallelism; Tensor parallelism [Slides]
Lecture 9: Data Preparation
- Data source; Deduplication; Quality filtering; Sensitive information reduction; Data composition; Data curriculum [Slides]
Lecture 10: Principles in Prompt Engineering
- Principles in Prompt Engineering [Slides]
- Reading: