
Calendar

Week 1

Sep 23
Introduction
  • Motivation
  • Course Overview
  • Logistics
Sep 27
Transformer Architecture
  • Sequence Modeling
  • RNNs
  • Transformers
  • Pretraining and Fine-Tuning
Sep 27
HW1 Released

Week 2

Sep 30
Hardware Aware Algorithm Design
  • Introduction to Compilers and the GPU Memory Hierarchy
  • Introduction to Arithmetic Intensity and Measures of Efficiency
Oct 04
Analyzing Transformer Performance
  • Measuring the FLOPs of MLP and Transformer Training and Inference (Intro to Backpropagation)
  • Reviewing Autoregressive Generation and KV Caching
  • Measuring the Efficiency of KV Caching (FLOPs, Arithmetic Intensity)
  • Speculative Decoding

Week 3

Oct 07
CUDA and GPU Programming for AI
Oct 07
HW1 Due at 9pm, HW2 Released
Oct 10
HW1 Grades and Solutions Released
Oct 11
Efficient Attention
  • Attention Bottlenecks and Attention Approximations to Improve Efficiency
  • I/O Aware Algorithms: FlashAttention

Week 4

Oct 14
Quantization, Sparsity, and Pruning
  • Structured Sparsity vs. Random Sparsity
  • Butterfly and Monarch Matrices
Oct 16
HW2 Due at 9pm
Oct 18
An Overview of LLM Training, Finetuning, and Inference
  • Scaling Laws
  • Zero-shot, Few-shot, Emergent Abilities
  • Instruction Following Models
  • RLHF, RLAIF, and Constitutional AI

Week 5

Oct 21
Parameter-Efficient Finetuning
Oct 21
Project Proposals Due at 9pm
Oct 25
Data in AI Pipelines
Oct 25
Project Proposal Grades and Feedback Released, Project Mentors Assigned

Week 6

Oct 28
Alternate Architectures to Transformers: Linear Attention
  • Efficiency properties of convolutions vs. RNNs vs. Transformers
  • State-space models and other subquadratic models
  • Fast Fourier transforms: efficient algorithms and hardware implementation
  • Course Project Intro
Nov 1
Alternate Architectures to Transformers: State Space Models, SSM Convolutions

Week 7

Nov 04
LLM Serving Efficiency
Nov 08
Sparse Mixture-of-Experts

Week 8

Nov 11
Guest Lecture: Dylan Patel
SemiAnalysis
Nov 11
Milestone Progress Reports Due at 9pm
Nov 15
Parallelism

Week 9

Nov 18
Cluster Scheduling
Nov 22
Guest Lecture: Albert Gu
CMU, Cartesia AI
Nov 25
Holiday: Thanksgiving Break
Nov 29
Holiday: Thanksgiving Break

Week 10

Dec 02
Efficient Retrieval Systems
Dec 06
Poster Session: Final Project Presentations
  • Noon-3PM, AT&T Patio, Gates
Dec 06
Final Project Reports Due at 9pm