2025, Vol. 35, No. 9, pp. 55-63
Image Inpainting Algorithm Based on Fast Fourier Convolution and Mask Attention
Foundation: Jinhua Public Welfare Technology Application Research Project (2022-4-060)
DOI: 10.20165/j.cnki.ISSN1673-629X.2025.0115

Abstract:

With the rapid development of deep neural networks, image inpainting technology has made significant breakthroughs and has been widely applied in many fields. Existing image inpainting methods usually stack more network layers or use dilated convolutions to enlarge the receptive field, which may lead to difficulties in network training, increased parameter counts and computational cost, and artifacts in the inpainting results. Therefore, an image inpainting algorithm based on a fast Fourier convolution module and a spatial mask attention mechanism is designed, consisting of an image inpainting main branch and a filter prediction auxiliary branch. Firstly, a fast Fourier convolution module is designed to transform the damaged image into the frequency domain for processing, which makes the noise and texture details in the image more salient. Secondly, a spatial mask attention module is proposed: the damaged image is processed in the spatial domain by combining its contextual information with the features produced by the encoder, and a learnable spatial information propagation operation then extracts features along two different spatial dimensions in turn. Thirdly, a lightweight filter prediction network is constructed, and the two predicted filters are used in the image inpainting main branch to filter the extracted features and the inpainting results pixel by pixel, enhancing edge details and further removing noise. Finally, the proposed method is compared with state-of-the-art inpainting methods on three publicly available datasets: CelebA, Paris Street View, and Places2. The experimental results show that the proposed method achieves better performance and visual effects.
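The key property of the fast Fourier convolution module is that a pointwise convolution applied in the frequency domain mixes information from all spatial positions at once, giving an image-wide receptive field without deep stacks of layers or dilated convolutions. Below is a minimal PyTorch sketch of such a spectral transform, assuming the common design of stacking the real and imaginary parts of a real 2-D FFT along the channel axis; the layer composition and channel sizes are illustrative, not the authors' exact configuration.

```python
import torch
import torch.nn as nn


class SpectralTransform(nn.Module):
    """1x1 convolution applied in the frequency domain (real 2-D FFT)."""

    def __init__(self, channels: int):
        super().__init__()
        # The real FFT yields complex values; real and imaginary parts are
        # stacked along the channel axis and mixed by a pointwise convolution.
        self.freq_conv = nn.Sequential(
            nn.Conv2d(channels * 2, channels * 2, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels * 2),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        _, _, h, w = x.shape
        freq = torch.fft.rfft2(x, norm="ortho")           # B x C x H x (W//2+1), complex
        freq = torch.cat([freq.real, freq.imag], dim=1)   # B x 2C x H x (W//2+1)
        freq = self.freq_conv(freq)                       # global receptive field in one step
        real, imag = torch.chunk(freq, 2, dim=1)
        # Back to the spatial domain at the original resolution.
        return torch.fft.irfft2(torch.complex(real, imag), s=(h, w), norm="ortho")
```

In a full fast Fourier convolution block this spectral path is typically combined with an ordinary local convolution path; the sketch shows only the frequency-domain part.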
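For the spatial mask attention module, the abstract describes fusing the contextual information of the damaged image with the encoder features and then applying a learnable spatial information propagation operation along two spatial dimensions in turn. The sketch below shows one plausible reading of that description; the module name `MaskAttention`, the depthwise 1-D propagation kernels, and the sigmoid gating are assumptions made for illustration and are not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskAttention(nn.Module):
    """Illustrative mask-guided attention with two-direction propagation."""

    def __init__(self, channels: int, k: int = 7):
        super().__init__()
        # Fuse encoder features with the (resized) binary hole mask.
        self.fuse = nn.Conv2d(channels + 1, channels, kernel_size=3, padding=1)
        # Learnable propagation along the height axis, then the width axis.
        self.prop_h = nn.Conv2d(channels, channels, kernel_size=(k, 1),
                                padding=(k // 2, 0), groups=channels)
        self.prop_w = nn.Conv2d(channels, channels, kernel_size=(1, k),
                                padding=(0, k // 2), groups=channels)
        self.to_attn = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, feat: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # mask: B x 1 x H0 x W0 with 1 marking missing pixels; resize to feature size.
        mask = F.interpolate(mask, size=feat.shape[-2:], mode="nearest")
        x = self.fuse(torch.cat([feat, mask], dim=1))
        x = self.prop_w(self.prop_h(x))            # propagate along both spatial dimensions
        attn = torch.sigmoid(self.to_attn(x))      # per-pixel gating weights
        # Blend original features with propagated context (an illustrative choice).
        return feat * attn + x * (1.0 - attn)
```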
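The filter prediction auxiliary branch outputs two filters that are applied pixel by pixel to the extracted features and to the inpainting result. Kernel-prediction filtering of this kind (used, for example, in MISF-style inpainting) can be implemented with an unfold-and-weight operation, as in the sketch below; the kernel size, the softmax normalisation, and the function name `pixelwise_filter` are illustrative assumptions, and the small network that predicts `kernels` is omitted.

```python
import torch
import torch.nn.functional as F


def pixelwise_filter(x: torch.Tensor, kernels: torch.Tensor, k: int = 3) -> torch.Tensor:
    """Apply a different k x k kernel at every spatial location.

    x:       B x C x H x W feature map or image.
    kernels: B x (k*k) x H x W per-pixel kernels from the auxiliary branch.
    """
    b, c, h, w = x.shape
    # Extract the k x k neighbourhood around every pixel: B x (C*k*k) x (H*W).
    patches = F.unfold(x, kernel_size=k, padding=k // 2)
    patches = patches.view(b, c, k * k, h, w)
    # Normalise each location's weights to sum to 1 (an optional design choice).
    weights = torch.softmax(kernels, dim=1).unsqueeze(1)   # B x 1 x k*k x H x W
    return (patches * weights).sum(dim=2)                  # B x C x H x W
```

A call such as `refined = pixelwise_filter(features, predicted_kernels)` replaces a fixed convolution with weights that adapt to each spatial location, which is what lets the filtering step sharpen edges while suppressing residual noise, as claimed in the abstract.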

Basic information:

CLC number: TP391.41; TP183

Citation:

[1] 张钊源, 曾志高, 谢峥嵘, et al. Image Inpainting Algorithm Based on Fast Fourier Convolution and Mask Attention [J]. 计算机技术与发展, 2025, 35(9): 55-63. DOI: 10.20165/j.cnki.ISSN1673-629X.2025.0115.
