FIPER: Factorized Features for Robust Image Super-Resolution and Compression
Abstract
In this work, we propose a unified representation, termed Factorized Features, for low-level vision tasks, and evaluate it on Single Image Super-Resolution (SISR) and Image Compression. Our motivation is the principle these tasks share: both require recovering and preserving fine image details, whether by enhancing resolution for SISR or by reconstructing compressed data for Image Compression. Unlike previous methods that mainly focus on network architecture, our approach uses a basis-coefficient decomposition together with an explicit formulation of frequencies to capture structural components and multi-scale visual features in images, which addresses the core challenges of both tasks. To validate its potential for broad generalizability, we replace the simple feature maps used as the representation in prior models with Factorized Features. We further optimize the compression pipeline by leveraging the mergeable-basis property of Factorized Features, which consolidates shared structures in multi-frame compression. Extensive experiments show that our unified representation delivers state-of-the-art performance, achieving an average relative improvement of 204.4% in PSNR over the baseline in Super-Resolution (SR) and a 9.35% BD-rate reduction in Image Compression compared to the previous SOTA.
TL;DR: A unified image representation for reconstructing fine details.
Comparison between decomposition-based, architecture-based, and our Factorized Features representation
The full Factorized Features formulation
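As a rough illustration of a basis-coefficient decomposition with explicit frequencies, the sketch below composes a feature map as a sum, over frequencies, of a spatially varying coefficient map multiplied by a small basis tile that repeats periodically across the image. The sawtooth coordinate transform, the nearest-neighbor lookup, the function names, and the toy sizes are all assumptions chosen for illustration; this is not the formulation used in the paper, which may differ in every one of these choices.

```python
def sawtooth(t, freq):
    """Wrap a coordinate t in [0, 1) into a periodic coordinate in [0, 1).

    Hypothetical coordinate transform, assumed here for illustration.
    """
    return (t * freq) % 1.0


def factorized_features(coeffs, bases, freqs, height, width):
    """Compose F[y][x] = sum_k coeffs[k][y][x] * bases[k][v][u].

    coeffs: list of K full-resolution coefficient maps (height x width).
    bases:  list of K small basis tiles, each repeated periodically.
    freqs:  list of K frequencies (tile repeats per image axis).
    """
    out = [[0.0] * width for _ in range(height)]
    for coeff, basis, freq in zip(coeffs, bases, freqs):
        bh, bw = len(basis), len(basis[0])
        for y in range(height):
            # Nearest-neighbor lookup into the periodically repeated tile.
            v = int(sawtooth(y / height, freq) * bh)
            for x in range(width):
                u = int(sawtooth(x / width, freq) * bw)
                out[y][x] += coeff[y][x] * basis[v][u]
    return out


# Toy example: a 4x4 feature map built from two frequencies.
H = W = 4
bases = [
    [[1.0, 2.0], [3.0, 4.0]],    # coarse tile, spans the image once (freq 1)
    [[1.0, -1.0], [-1.0, 1.0]],  # fine checkerboard, repeats twice (freq 2)
]
coeffs = [
    [[1.0] * W for _ in range(H)],  # constant weight for the coarse basis
    [[0.5] * W for _ in range(H)],  # constant weight for the fine basis
]
features = factorized_features(coeffs, bases, freqs=[1, 2], height=H, width=W)
```

In this toy setting the low-frequency term carries coarse structure and the high-frequency term adds a repeating fine pattern. In a learned model both the coefficient maps and the basis tiles would be predicted by a network rather than fixed by hand.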
Super-Resolution and Image Compression with Factorized Features
(b) For Image Compression, the synthesis transform of a learned compression model is replaced with our SR module. This design leverages SR priors and the structured basis–coefficient representation to restore details more effectively after quantization, reducing distortion at comparable bitrates.
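The data flow described in (b) can be sketched with a toy codec: an analysis transform produces a low-resolution latent, the latent is quantized, and an upsampling module takes the place of the usual synthesis transform. Every component here is a hand-written stand-in assumed for illustration (average pooling as the encoder, uniform scalar quantization, nearest-neighbor upsampling as the "SR module"). The learned transforms, entropy coding, and Factorized Features of the actual model are not represented.

```python
import math


def analysis_transform(image):
    """Toy encoder (assumption): 2x2 average pooling to a half-size latent."""
    h, w = len(image), len(image[0])
    return [
        [
            (image[y][x] + image[y][x + 1]
             + image[y + 1][x] + image[y + 1][x + 1]) / 4.0
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def quantize(latent, step):
    """Uniform scalar quantization: the lossy step of the codec."""
    return [[round(v / step) * step for v in row] for row in latent]


def sr_synthesis(latent, scale=2):
    """Stand-in for an SR module used as decoder: nearest-neighbor upsampling."""
    out = []
    for row in latent:
        wide = [v for v in row for _ in range(scale)]
        out.extend([list(wide) for _ in range(scale)])
    return out


def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    n = len(a) * len(a[0])
    mse = sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)) / n
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


image = [
    [10.0, 12.0, 200.0, 202.0],
    [14.0, 16.0, 204.0, 206.0],
    [50.0, 52.0, 100.0, 102.0],
    [54.0, 56.0, 104.0, 106.0],
]
latent = quantize(analysis_transform(image), step=4.0)
reconstruction = sr_synthesis(latent)
quality_db = psnr(image, reconstruction)
```

With a fixed quantization step, and hence a roughly fixed bitrate, the only way to raise `quality_db` in this sketch is a better decoder. This is the role the SR module plays in the design described above.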
Visual comparisons on super-resolution (4×)
Performance (RD-Curve) evaluation on image compression using different datasets
Acknowledgements
This research was funded by the National Science and Technology Council, Taiwan, under Grants NSTC 112-2222-E-A49-004-MY2 and 113-2628-E-A49-023-. The authors are grateful to Google, NVIDIA, and MediaTek Inc. for generous donations. Yu-Lun Liu acknowledges the Yushan Young Fellow Program by the MOE in Taiwan.