Poster

MEET: Towards Memory-Efficient Temporal Sparse Deep Neural Networks

Zeqi Zhu · Ibrahim Batuhan Akkaya · Luc Waeijen · Egor Bondarev · Arash Pourtaherian · Orlando Moreira

ExHall D Poster #302
Sun 15 Jun 2 p.m. PDT — 4 p.m. PDT

Abstract: Deep Neural Networks (DNNs) are accurate but compute-intensive, leading to substantial energy consumption during inference. Exploiting temporal redundancy through $\Delta$-$\Sigma$ convolution in video processing has proven to greatly enhance computation efficiency. However, temporal $\Delta$-$\Sigma$ DNNs typically require substantial memory for storing neuron states to compute inter-frame differences, hindering their on-chip deployment. Directly compressing these states to mitigate the memory cost can disrupt the linearity of temporal $\Delta$-$\Sigma$ convolution, causing accumulated errors in long-term $\Delta$-$\Sigma$ processing. Thus, we propose $\textbf{MEET}$, an optimization framework for $\textbf{ME}$mory-$\textbf{E}$fficient $\textbf{T}$emporal $\Delta$-$\Sigma$ DNNs. MEET transfers the state compression challenge to a well-established weight compression problem by trading fewer activations for more weights, and introduces a co-design of network architecture and suppression method to optimize for mixed spatial-temporal execution. Evaluations on three vision applications demonstrate a reduction of 5.1$\sim$13.3$\times$ in total memory compared to the most computation-efficient temporal DNNs, while preserving the computation efficiency and model accuracy in long-term $\Delta$-$\Sigma$ processing. MEET facilitates the deployment of temporal $\Delta$-$\Sigma$ DNNs within the on-chip memory of embedded event-driven platforms, empowering low-power edge processing.
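To make the abstract's core idea concrete, the sketch below illustrates $\Delta$-$\Sigma$ convolution on a toy 1-D signal: because convolution is linear, each frame's output can be obtained by convolving only the inter-frame difference ($\Delta$) and accumulating it into the previous output ($\Sigma$), which is cheap when frames change little. This also shows why the stored neuron state (`prev_in`) is needed, and why suppressing small deltas (`thresh`) trades accuracy for sparsity. This is a minimal illustrative sketch, not the paper's MEET implementation; the names `DeltaSigmaConv`, `conv`, and `thresh` are our own for illustration.

```python
import numpy as np

def conv(x, w):
    # Naive 1-D 'valid' convolution (cross-correlation) for illustration.
    return np.array([np.dot(x[i:i + len(w)], w)
                     for i in range(len(x) - len(w) + 1)])

class DeltaSigmaConv:
    """Toy temporal Delta-Sigma convolution layer (illustrative only)."""

    def __init__(self, w, in_len):
        self.w = np.asarray(w, dtype=float)
        # Stored neuron state: the reconstructed previous input frame.
        # This per-neuron state is the memory cost MEET targets.
        self.prev_in = np.zeros(in_len)
        # Sigma stage: running accumulation of convolved deltas.
        self.out_acc = np.zeros(in_len - len(w) + 1)

    def step(self, x, thresh=0.0):
        # Delta stage: inter-frame difference against the stored state.
        delta = x - self.prev_in
        # Optional suppression: zero out sub-threshold changes to
        # increase temporal sparsity (introduces approximation error).
        delta = np.where(np.abs(delta) > thresh, delta, 0.0)
        # Update the stored state with the (possibly suppressed) delta,
        # so suppression error does not accumulate across frames.
        self.prev_in = self.prev_in + delta
        # Sigma stage: by linearity, conv(x) = conv(x_prev) + conv(delta).
        self.out_acc = self.out_acc + conv(delta, self.w)
        return self.out_acc
```

With `thresh=0` the accumulated output matches dense convolution of each frame exactly; with a positive threshold, far fewer delta entries are nonzero on slowly changing video, at the cost of bounded per-neuron error.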