Calendar
Week 1
- Sep 23
-
Introduction
- Motivation
- Course Overview
- Logistics
- Sep 27
-
Transformer Architecture
- Sequence Modeling
- RNNs
- Transformers
- Pretraining and Fine-Tuning
- Sep 27
- HW1 Released
Week 2
- Sep 30
-
Hardware Aware Algorithm Design
- Introduction to Compilers and the GPU Memory Hierarchy
- Introduction to Arithmetic Intensity and Measures of Efficiency
- Oct 04
-
Analyzing Transformer Performance
- Measuring the FLOPs of MLP and Transformer Training and Inference (Intro to Backpropagation)
- Reviewing Autoregressive Generation and KV Caching
- Measuring the Efficiency of KV Caching (FLOPs, Arithmetic Intensity)
- Speculative Decoding
Week 3
- Oct 07
-
CUDA and GPU Programming for AI
- Oct 07
- HW1 Due at 9pm, HW2 Released
- Oct 10
- HW1 Grades and Solutions Released
- Oct 11
-
Efficient Attention
- Attention Bottlenecks and Attention Approximations to Improve Efficiency
- I/O Aware Algorithms: FlashAttention
Week 4
- Oct 14
-
Quantization, Sparsity, and Pruning
- Structured Sparsity vs. Random Sparsity
- Butterfly and Monarch Matrices
- Oct 16
- HW2 Due at 9pm
- Oct 18
-
An Overview of LLM Training, Finetuning, and Inference
- Scaling Laws
- Zero-shot, Few-shot, Emergent Abilities
- Instruction Following Models
- RLHF, RLAIF, Constitutional AI
Week 5
- Oct 21
-
Parameter-Efficient Finetuning
- Parameter-Efficient Finetuning
- Oct 21
- Project Proposals Due at 9pm
- Oct 25
-
Data in AI Pipelines
- Oct 25
- Project Proposal Grades and Feedback Released, Project Mentors Assigned
Week 6
- Oct 28
-
Alternate Architectures to Transformers: Linear Attention
- Efficiency properties of convolutions vs. RNNs vs. Transformers
- State-space models and other subquadratic models
- Fast Fourier transforms: efficient algorithms and hardware implementation
- Course Project Intro
- Nov 01
-
Alternate Architectures to Transformers: State Space Models, SSM Convolutions
Week 7
- Nov 04
-
LLM Serving Efficiency
- Nov 08
-
Sparse Mixture-of-Experts
Week 8
- Nov 11
-
- Guest Lecture: Dylan Patel
- SemiAnalysis
- Nov 11
- Milestone Progress Reports Due at 9pm
- Nov 15
-
Parallelism
Week 9
Week 10
- Dec 02
-
Efficient Retrieval Systems
- Dec 06
-
Poster Session
Final Project Presentation
- Noon-3pm, AT&T Patio, Gates
- Dec 06
- Final Project Reports Due at 9pm