Enhancing Image Restoration through Hybrid VQVAE and Transformer Architectures: A Novel Approach
Abstract
Image restoration is essential for enhancing the quality and utility of degraded images in applications such as autonomous driving and robotics. Traditional convolutional neural network (CNN) approaches, while effective, often fail to capture the long-range dependencies in images that are crucial for handling complex degradations such as low-light, underwater, and hazy conditions. Recent work has introduced transformer architectures, which excel at modeling non-local information but struggle to retain localized detail and are susceptible to interference from irrelevant features. This paper presents a novel hybrid approach that integrates vector-quantized variational autoencoders (VQVAE) with transformer architectures to leverage the strengths of both methods. Our approach begins with a context-enriched VQVAE that establishes a discrete codebook capturing essential visual features. A transformer model then uses these codewords to restore image quality through an auto-regressive prediction framework. The proposed method not only advances the state of the art in image restoration but also opens new avenues for applying hybrid deep learning architectures to complex image processing tasks.
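The discrete codebook mentioned in the abstract can be illustrated with a minimal sketch of the vector-quantization step: each feature vector is replaced by the index of its nearest codeword, and those indices are the tokens a transformer could then predict auto-regressively. This is not the authors' implementation; the function names, the squared-Euclidean distance, and the toy values below are assumptions chosen only for illustration.

```python
# Illustrative sketch only: nearest-codeword lookup as used in generic
# vector quantization. Names and toy data are hypothetical, not from the paper.

def nearest_codeword(vector, codebook):
    """Return the index of the codeword closest to `vector` (squared L2)."""
    best_index, best_dist = 0, float("inf")
    for i, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vector, code))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def quantize(features, codebook):
    """Map every feature vector to a discrete codeword index."""
    return [nearest_codeword(f, codebook) for f in features]


# Toy codebook with three 2-D codewords (assumed values).
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
# Toy encoder outputs, one vector per image patch (assumed values).
features = [[0.1, -0.2], [0.9, 1.2], [3.5, 0.4]]

indices = quantize(features, codebook)
# Looking the indices up in the codebook gives the quantized features.
quantized = [codebook[i] for i in indices]
```

In a full model the codebook would be learned jointly with the encoder and decoder, and the index sequence would serve as the input and prediction target for the transformer stage.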
Subject
Auto-Regressive Prediction
Hybrid Architectures
Image Restoration
Transformers
Vector-Quantized Variational Autoencoders (VQVAE)
Permanent Link
http://digital.library.wisc.edu/1793/96358
Type
Book chapter
Citation
Xue, Y., Chen, X., & He, Y. (2025). Enhancing Image Restoration through Hybrid VQVAE and Transformer Architectures: A Novel Approach. In ASIG ’24: Proceedings of the 2024 2nd Asia Symposium on Image and Graphics (pp. 125–129). New York, N.Y.: Association for Computing Machinery. https://doi.org/10.1145/3718441.3718460

