Tutorial

Full-Stack, GPU-based Acceleration of Deep Learning and Foundation Models

Jason Clemons, Hongxu (Danny) Yin, and Xinglong Sun

2025 Tutorial

Project Page

Abstract

This tutorial covers techniques across the hardware-software stack for accelerating deep neural networks, from convolutions to multimodal LLMs. Attendees will learn practical tools and the trade-offs involved in optimizing performance. The tutorial also aims to inspire the next generation of scalable acceleration techniques.
