OMSCS Seminar: System and Architecture Concepts for ML and Large Language Models

Overview


Course Description

This seminar introduces the system-level concepts underpinning modern ML and LLM workloads. Topics include OS-level scheduling and kernel resource management, accelerator and GPU architecture optimizations, memory hierarchies, storage solutions, virtualization, and network designs optimized for large-scale data processing. The course emphasizes conceptual understanding through readings and discussions; no prior programming or systems background is required.

Course Content

Week 1:
Introduction to LLMs and Seminar Overview
– Seminar logistics and expectations
– High-level concepts behind LLMs and system-level impacts

Week 2:
Reading Week – “Attention Is All You Need” (NeurIPS 2017)
– Origins of the Transformer compute pattern
– Foundational architecture for modern LLMs
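
For a concrete picture of the compute pattern introduced in this reading, the sketch below implements single-head scaled dot-product attention in plain NumPy. It is a minimal illustration, not the paper's reference code; the shapes and the softmax helper are my own choices.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (seq_len, d) matrices for a single attention head.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)        # (seq_len, seq_len) similarity matrix
    weights = softmax(scores, axis=-1)   # each query attends over all keys
    return weights @ V                   # weighted sum of value vectors

# Toy usage: 4 tokens, 8-dimensional head.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```

The (seq_len by seq_len) score matrix is the quantity whose quadratic growth motivates several later readings, including FlashAttention and PagedAttention.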

Week 3:
Discussion: BERT (NAACL 2019) + Week 2 Reading
– Pre-training and workload evolution
– How model scale changes system and resource needs
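
To make "how model scale changes system and resource needs" concrete, here is a rough, assumed accounting of training-state memory: fp16 weights and gradients plus fp32 Adam master weights, momentum, and variance, or about 16 bytes per parameter (the figure used in the ZeRO line of work). Activations and KV caches are excluded, so treat the numbers as a lower bound.

```python
def training_state_gb(num_params, bytes_per_param=16):
    # Assumed mixed-precision Adam accounting: fp16 weights (2) + fp16 grads (2)
    # + fp32 master weights (4) + fp32 momentum (4) + fp32 variance (4) = 16 B/param.
    return num_params * bytes_per_param / 1e9

for name, n in [("BERT-large, ~340M params", 340e6), ("GPT-3 scale, ~175B params", 175e9)]:
    print(f"{name}: ~{training_state_gb(n):,.0f} GB of training state")
# ~5 GB fits on a single GPU; ~2,800 GB has to be partitioned or offloaded.
```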

Week 4:
Reading: "Mind the Memory Gap" (pre-publication/working paper)
– OS-level and hardware implications of memory hierarchies in ML workloads

Week 5:
Discussion: ZeRO-Infinity (SC 2021) + Week 4 Reading
– Mitigating GPU memory bottlenecks in large model training
– Partitioning strategies and offloading techniques
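
As a toy illustration of the partitioning idea (not DeepSpeed's actual API), the sketch below splits optimizer state evenly across data-parallel ranks; ZeRO-Infinity additionally lets each rank keep its shard in CPU or NVMe memory and fetch it on demand.

```python
def shard_ranges(num_params, world_size):
    # ZeRO-style partitioning: each data-parallel rank owns 1/world_size of the
    # optimizer state instead of every rank holding a full replica.
    shard = (num_params + world_size - 1) // world_size
    return [(r, r * shard, min((r + 1) * shard, num_params)) for r in range(world_size)]

# 10B parameters across 8 ranks: ~1.25B parameters' worth of state per rank,
# which can then live on the GPU, in host DRAM, or on NVMe (offloading).
for rank, start, end in shard_ranges(10_000_000_000, 8):
    print(f"rank {rank}: owns state for params [{start:,}, {end:,})")
```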

Week 6:
Reading: PagedAttention / vLLM (SOSP 2023)
– Memory-efficient inference for LLMs
– Virtual memory and runtime paging for transformers
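
The sketch below is a toy block-table allocator in the spirit of PagedAttention; the class and method names are invented for illustration and are not vLLM's API. The point is the OS analogy: KV-cache memory is handed out in fixed-size blocks on demand rather than reserved contiguously per request.

```python
class PagedKVCache:
    # Toy block-table allocator in the spirit of PagedAttention (not vLLM's API).

    def __init__(self, num_blocks, block_size):
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))  # pool of physical KV blocks
        self.block_tables = {}                      # seq_id -> list of block ids
        self.lengths = {}                           # seq_id -> tokens stored so far

    def append_token(self, seq_id):
        # Demand paging: grab a new physical block only when the sequence's
        # current block is full, so memory is never reserved up front.
        n = self.lengths.get(seq_id, 0)
        table = self.block_tables.setdefault(seq_id, [])
        if n % self.block_size == 0:
            table.append(self.free_blocks.pop())
        self.lengths[seq_id] = n + 1
        # Physical slot where this token's key/value vectors would be stored.
        return table[n // self.block_size], n % self.block_size

    def free(self, seq_id):
        # A finished request returns its blocks to the pool immediately.
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)

cache = PagedKVCache(num_blocks=64, block_size=16)
for _ in range(40):
    cache.append_token("request-0")
print(len(cache.block_tables["request-0"]))  # 3 blocks cover 40 tokens
cache.free("request-0")
```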

Week 7:
Discussion: CacheGen (SIGCOMM 2024) + Week 6 Reading
– How memory management innovations impact LLM performance
– Compiler/runtime co-design for caching

Week 8:
Reading: MLPerf Benchmark Suite
– Standardized benchmarks for ML training and inference
– Overview of workloads and performance metrics

Week 9:
Discussion: MLPerf + Week 8 Reading
– Trade-offs in system design using benchmarked data
– Latency, throughput, power efficiency
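
A minimal sketch of the kind of measurement these discussions revolve around is shown below; `fake_inference` is an invented stand-in for a real model call, and the numbers it produces only illustrate the batch-size trade-off between throughput and tail latency.

```python
import time, statistics

def fake_inference(batch):
    # Stand-in for a model forward pass; replace with a real call.
    time.sleep(0.002 * len(batch))
    return [0] * len(batch)

def benchmark(batch_size, num_batches=200):
    latencies = []
    start = time.perf_counter()
    for _ in range(num_batches):
        t0 = time.perf_counter()
        fake_inference(list(range(batch_size)))
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    latencies.sort()
    return {
        "throughput_samples_per_s": batch_size * num_batches / elapsed,
        "p50_latency_ms": 1000 * statistics.median(latencies),
        "p99_latency_ms": 1000 * latencies[int(0.99 * len(latencies)) - 1],
    }

# Larger batches usually raise throughput but also raise per-request latency,
# which is the core trade-off surfaced by benchmark suites like MLPerf.
print(benchmark(batch_size=1))
print(benchmark(batch_size=8))
```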

Week 10:
Reading: FlashAttention
– Algorithmic improvements for attention layers
– Efficient memory usage during training and inference
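
The sketch below shows the online-softmax trick at the heart of FlashAttention, in NumPy: attention is computed by streaming over key/value blocks, so the full seq_len-by-seq_len score matrix is never materialized. The real kernel also tiles queries, exploits on-chip SRAM, and fuses the whole computation into one GPU kernel; none of that is modeled here.

```python
import numpy as np

def attention_tiled(Q, K, V, block=64):
    # Online softmax over key/value blocks: keep a running max (m), a running
    # softmax denominator (l), and an unnormalized output accumulator (acc),
    # rescaling them whenever a new block raises the running max.
    n, d = Q.shape
    acc = np.zeros_like(Q)
    m = np.full(n, -np.inf)
    l = np.zeros(n)
    for start in range(0, K.shape[0], block):
        Kb, Vb = K[start:start + block], V[start:start + block]
        s = Q @ Kb.T / np.sqrt(d)                  # scores for this block only
        m_new = np.maximum(m, s.max(axis=1))
        scale = np.exp(m - m_new)                  # rescale previous accumulators
        p = np.exp(s - m_new[:, None])
        acc = acc * scale[:, None] + p @ Vb
        l = l * scale + p.sum(axis=1)
        m = m_new
    return acc / l[:, None]

# Matches naive attention while only ever holding (n x block) score tiles.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(256, 32)) for _ in range(3))
scores = Q @ K.T / np.sqrt(32)
weights = np.exp(scores - scores.max(axis=1, keepdims=True))
weights /= weights.sum(axis=1, keepdims=True)
print(np.allclose(attention_tiled(Q, K, V), weights @ V))  # True
```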

Week 11:
Discussion: Tentative + Week 10 Reading
– FlashAttention vs. standard attention mechanisms
– The role of custom CUDA kernels and hardware optimizations

Week 12:
Reading: QoS-Efficient Serving of Multiple MoE LLMs Using Partial Runtime Reconfiguration
– Serving multiple mixture-of-expert models efficiently
– QoS-aware scheduling and reconfiguration
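
To ground the discussion, the sketch below shows the top-k gating pattern that defines mixture-of-experts inference; the random gating weights are purely illustrative and not taken from the paper. Each token touches only k experts, so which experts become hot, and on which GPUs they live, is what QoS-aware scheduling and runtime reconfiguration have to manage.

```python
import numpy as np

def route_tokens(token_reprs, gate_weights, k=2):
    # Top-k gating: each token is sent to the k experts with the highest
    # gate scores, so only a fraction of the model's weights are touched
    # per token.
    logits = token_reprs @ gate_weights          # (tokens, num_experts)
    return np.argsort(-logits, axis=1)[:, :k]    # expert ids per token

rng = np.random.default_rng(0)
tokens = rng.normal(size=(16, 64))               # 16 token representations
gate = rng.normal(size=(64, 8))                  # router for 8 experts
assignments = route_tokens(tokens, gate)
# Per-expert load; imbalance here is what QoS-aware schedulers must absorb.
print(np.bincount(assignments.ravel(), minlength=8))
```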

Week 13:
Discussion: Synergy (OSDI 2022) + Week 12 Reading
– Resource management in multi-tenant clusters
– OS/hypervisor techniques for optimizing ML workloads

Week 14:
Reading: ALISA (ISCA 2024)
– Accelerator-aware load balancing and scheduling
– Co-design of datacenter infrastructure for AI workloads

Week 15:
Discussion: Tentative + Week 14 Reading
– Datacenter architectures, accelerators, and power-aware scheduling
– Wrap-up of system-level trends in ML and LLMs

Week 16:
Buffer/Make-Up Week
– Reserved for catch-up in case of holidays or conference conflicts

 


Requirements & Materials

Materials

PROVIDED (Student will receive):

All content is available in Canvas.

Who Should Attend

This seminar is designed for OMSCS students and alumni interested in the system-level foundations that support machine learning and large language models.


What You Will Learn

  • The role of operating systems, kernels, and schedulers in ML/LLM workloads
  • GPU and accelerator architectures and how they affect performance
  • How to read and interpret systems research papers and white papers

How You Will Benefit

  • Gain a foundational understanding of the hardware and software systems behind ML and LLMs.
  • Learn to evaluate architectural trade-offs when scaling or deploying models.
  • Learn to read and critically assess systems-focused academic and industry publications.
  • Develop a working understanding of efficient, secure, scalable, and energy-efficient architectures for ML, DL, and LLM applications.
  • Gain insight into performance bottlenecks, OS and hardware interactions, container orchestration, and resource scheduling through white papers and conference publications.
  • Grow Your Professional Network
  • Taught by Experts in the Field
