Seminar on long-context language models

CMSC 848O, Spring 2025, UMD CS
Tues/Thurs 12:30-1:45 PM in CSI 2117
See schedule for all videos, readings, and assignments.

Instructor: Mohit Iyyer
TA: Chau Pham
Email (to both of us): longcontextseminar@gmail.com
Office hours: Thurs 2-3PM, IRB 4142
Course objectives: The primary goal of this seminar is to discuss recent developments in long-context language modeling. More specifically, we will critically examine advances in the training, alignment, and evaluation of long-context language models, which have allowed cutting-edge LLMs to process and generate millions of words. In addition to the course material, students will also learn how to read and evaluate research papers, as well as how to formulate research problems in this space and develop solutions for them. The course is intended for graduate students who are interested in NLP/LLM research.

Class format: This course will consist primarily of student-led discussions of recent research papers. The first 3 weeks of the semester will be lectures delivered by the instructor to establish background on Transformer language models. Then, we will move to student-led presentations, in which 1-2 research papers will be discussed per class and a subset of students will be assigned different roles (e.g., presenter, reviewer, archaeologist) to perform during the discussion. All students will submit questions about each assigned paper at least one day prior to its corresponding class.

Expectations: Attendance is mandatory, and active participation is expected: you will get to know your classmates as you work through the struggle of understanding complex papers together. Students can expect to carefully read 2-4 papers per week, and each student should expect to present an assigned paper 2-3 times during the semester. Additionally, there will be two small-scale writing assignments as well as an exam towards the end of the semester, all of which cover topics from the presented papers.