Poster

CFAT: Unleashing Triangular Windows for Image Super-resolution

Abhisek Ray ⋅ Gaurav Kumar ⋅ Maheshkumar Kolekar

Highlight

2024 Poster

[ Slides] [ Poster] [Paper PDF]

Abstract

Transformer-based models have revolutionized the field of image super-resolution by harnessing their inherent ability to capture complex contextual features. The overlapping rectangular shifted window technique used in transformer architecture nowadays is a common practice in super-resolution models to improve the quality and robustness of image upscaling. However, it suffers from distortion at the boundaries and has limited unique shifting modes. To overcome these weaknesses, we propose an overlapping triangular window technique that synchronously works with the rectangular one to reduce boundary-level distortion and allow the model to access more unique sifting modes. In this paper, we propose a Composite Fusion Attention Transformer (CFAT) that incorporates triangular-rectangular window-based local attention with a channel-based global attention technique in image super-resolution. As a result, CFAT enables attention mechanisms to be activated on more image pixels and captures long-range, multi-scale features to improve SR performance. The extensive experimental results and ablation study demonstrate the effectiveness of CFAT in the SR domain. Our proposed model shows a significant 0.7 dB performance improvement over other state-of-the-art SR architectures.

Chat is not available.