SenCache: Accelerating Diffusion Model Inference via Sensitivity-Aware Caching
Yasaman Haghighi ⋅ Alex Alahi
Abstract
Diffusion models achieve state-of-the-art video generation quality, but their many sequential denoising steps create a major computational bottleneck. Existing acceleration methods reuse cached model outputs at fixed timesteps chosen through heuristics, requiring heavy tuning and failing to adapt to each sample's complexity. We address this with a principled, sensitivity-aware caching framework. We first formalize the caching problem by analyzing the network's output sensitivity with respect to changes in its inputs, namely the noisy latent and the timestep. We demonstrate that this sensitivity is the key indicator of caching error. Building on this insight, we introduce Sensitivity-Aware Caching (SenCache), a dynamic strategy that adaptively selects which timesteps to cache on a per-sample basis, caching less on challenging samples and accelerating more aggressively on simpler ones. Our method provides a robust theoretical grounding for adaptive caching, explaining why previous empirical criteria are partially effective and extending them with a dynamic, sample-specific approach. Experiments on the Wan 2.1, CogVideoX, and LTX-Video models demonstrate that our method outperforms existing caching strategies in visual quality under similar computational budgets.
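To make the idea concrete, the following is a minimal sketch of a sensitivity-aware caching loop, not the paper's actual algorithm. It uses a hypothetical proxy for output sensitivity (the relative change of the noisy latent between steps) and a placeholder latent-update rule; the function names, threshold, and update rule are all illustrative assumptions.

```python
import numpy as np

def denoise_with_sensitivity_cache(model, latent, timesteps, threshold=0.05):
    """Sketch of sensitivity-aware caching: reuse the last cached network
    output whenever an estimated sensitivity at the current step is low.

    `model(latent, t)` stands in for the expensive denoising network call.
    The sensitivity proxy and the latent update below are placeholders,
    not the method described in the paper.
    """
    cached_output = None
    prev_latent = None
    for t in timesteps:
        if cached_output is not None and prev_latent is not None:
            # Proxy sensitivity: relative change of the input latent.
            # If the input barely moved, the output is assumed stable too.
            sensitivity = (np.linalg.norm(latent - prev_latent)
                           / (np.linalg.norm(prev_latent) + 1e-8))
            reuse = sensitivity < threshold
        else:
            reuse = False  # always run the network on the first step
        if reuse:
            output = cached_output  # skip the expensive forward pass
        else:
            output = model(latent, t)  # full network evaluation
            cached_output = output
        prev_latent = latent.copy()
        latent = latent - 0.1 * output  # placeholder update, not a real scheduler
    return latent
```

With a small `threshold`, hard samples (whose latents change quickly between steps) trigger more network calls, while easy samples reuse the cache more often, which is the per-sample adaptivity the abstract describes.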