
Calendar

Week 1

Sep 23
Introduction
  • Motivation
  • Course Overview
  • Logistics
Sep 27
Transformer Architecture
  • Sequence Modeling
  • RNNs
  • Transformers
  • Pretraining and Fine-Tuning
Sep 27
HW1 Released

Week 2

Sep 30
Hardware Aware Algorithm Design
  • Introduction to Compilers and the GPU Memory Hierarchy
  • Introduction to Arithmetic Intensity and Measures of Efficiency
Oct 04
Analyzing Transformer Performance
  • Measuring the FLOPs of MLP and Transformer Training and Inference (Intro to Backpropagation)
  • Reviewing Autoregressive Generation and KV Caching
  • Measuring the Efficiency of KV Caching (FLOPs, Arithmetic Intensity)
  • Speculative Decoding

Week 3

Oct 07
CUDA and GPU Programming for AI
Oct 07
HW1 Due at 9pm, HW2 Released
Oct 10
HW1 Grades and Solutions Released
Oct 11
Efficient Attention
  • Attention Bottlenecks and Attention Approximations to Improve Efficiency
  • I/O Aware Algorithms: FlashAttention

Week 4

Oct 14
Quantization, Sparsity, and Pruning
  • Structured Sparsity vs. Random Sparsity
  • Butterfly and Monarch Matrices
Oct 16
HW2 Due at 9pm
Oct 18
An Overview of LLM Training, Finetuning, and Inference
  • Scaling Laws
  • Zero-shot, Few-shot, Emergent Abilities
  • Instruction Following Models
  • RLHF, RLAIF, and Constitutional AI

Week 5

Oct 21
Parameter-Efficient Finetuning
Oct 21
Project Proposals Due at 9pm
Oct 25
Data in AI Pipelines
Oct 25
Project Proposal Grades and Feedback Released, Project Mentors Assigned

Week 6

Oct 28
Alternate Architectures to Transformers: Linear Attention
  • Efficiency properties of convolutions vs. RNNs vs. Transformers
  • State-space models and other subquadratic models
  • Fast Fourier transforms: efficient algorithms and hardware implementation
  • Course Project Intro
Nov 1
Alternate Architectures to Transformers: State Space Models, SSM Convolutions

Week 7

Nov 04
LLM Serving Efficiency
Nov 08
Sparse Mixture-of-Experts

Week 8

Nov 11
Guest Lecture: Dylan Patel
SemiAnalysis
Nov 11
Milestone Progress Reports Due at 9pm
Nov 15
Parallelism

Week 9

Nov 18
Cluster Scheduling
Nov 22
Guest Lecture: Albert Gu
CMU, Cartesia AI
Nov 25
Holiday: Thanksgiving Break
Nov 29
Holiday: Thanksgiving Break

Week 10

Dec 02
Efficient Retrieval Systems
Dec 06
Poster Session: Final Project Presentations
  • Noon-3PM, AT&T Patio, Gates
Dec 06
Final Project Reports Due at 9pm