Under limited bandwidth, video transmission often relies on lossy compression to reduce data volume, which inevitably introduces compression noise. Quality enhancement of compressed video can effectively recover the information lost during compression. Multi-frame quality enhancement currently outperforms single-frame methods because it exploits the temporal correlation between frames. Methods based on deformable convolution align multiple frames to obtain spatio-temporal fusion features for reconstruction. However, because they make limited use of deep information and are sensitive to alignment accuracy, these methods yield suboptimal results, especially under scene changes and intense motion. To overcome these limitations, we propose a quality enhancement method based on a dense network that obtains more accurate spatio-temporal fusion features. Specifically, deep spatial features are first extracted from the to-be-enhanced frames using dense connections. These features are then combined with the aligned features produced by deformable convolution through convolution and an attention mechanism, so that the network adaptively attends to the more useful branches. Finally, the enhanced frames are produced by a quality enhancement module built on a densely connected structure. Experimental results show that, at a quantization parameter of 37, the proposed method improves the average peak signal-to-noise ratio (PSNR) by 0.99 dB in the lowdelay_P configuration.
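For readers unfamiliar with the reported metric, the following is a minimal sketch of how an average peak signal-to-noise ratio (PSNR) gain of this kind is commonly computed. It assumes 8-bit samples (peak value 255) and frames given as flat sequences of pixel values. The function names `psnr` and `average_psnr_gain` are hypothetical and are not taken from the article, whose exact evaluation protocol is not described here.

```python
import math


def psnr(reference, distorted, peak=255.0):
    """Return the PSNR in dB between two equally sized pixel sequences.

    Assumption: pixels are plain numbers in [0, peak]; 8-bit video is assumed.
    """
    if len(reference) != len(distorted) or len(reference) == 0:
        raise ValueError("frames must be non-empty and of equal size")
    # Mean squared error between the original and the distorted frame.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)


def average_psnr_gain(references, compressed, enhanced):
    """Average per-frame PSNR improvement of enhanced over compressed frames."""
    gains = [
        psnr(ref, enh) - psnr(ref, comp)
        for ref, comp, enh in zip(references, compressed, enhanced)
    ]
    return sum(gains) / len(gains)


# Toy data (illustrative only, not from the article): one 4-pixel frame.
references = [[100, 100, 100, 100]]
compressed = [[102, 98, 102, 98]]   # MSE = 4
enhanced = [[101, 99, 101, 99]]     # MSE = 1
gain_db = average_psnr_gain(references, compressed, enhanced)
```

Under this definition, a positive gain means the enhanced frames are closer to the uncompressed originals than the compressed frames are.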
Video
Video compression
Feature fusion
Image enhancement
Feature extraction
Image quality
Convolution