| Downloads | Citations | Reads |
| 99 | 0 | 48 |
Abstract: Video anomaly detection is a key task in computer vision. Traditional methods are susceptible to background-noise interference in complex scenes, struggle to capture local details, and tend to overfit the training data, which weakens generalization and robustness. To address these challenges, this paper proposes a video anomaly detection algorithm that combines DropBlock with attention mechanisms. Built on a U-Net architecture, it introduces a Squeeze-and-Excitation (SE) module into the bottleneck layer and a Spatial Attention Module into the skip connections: the SE module strengthens the feature representation of important channels through channel attention, while the spatial attention module dynamically adjusts spatial weights to sharpen the focus on key regions. A Transformer is integrated after the SE module to improve the modeling of spatio-temporal video features. In addition, introducing DropBlock into the convolutional layers effectively alleviates overfitting in the convolutional network and improves generalization. Experimental results show that the proposed method reaches AUC scores of 96.9%, 86.2%, and 73.1% on the public UCSD-Ped2, CUHK Avenue, and ShanghaiTech datasets, respectively, verifying its effectiveness.
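The two regularization and attention mechanisms named in the abstract can be sketched in NumPy as follows. This is an illustrative reconstruction from the cited papers on SE networks [9] and DropBlock [7], not the authors' implementation; the function names, weight shapes (`w1`, `w2`), and hyperparameter defaults are assumptions for demonstration only.

```python
import numpy as np

def se_channel_attention(feat, w1, w2):
    """Channel attention in the Squeeze-and-Excitation style (Hu et al.):
    squeeze each channel to a scalar via global average pooling, pass it
    through a small two-layer bottleneck, and rescale the channels."""
    c, h, w = feat.shape
    squeeze = feat.mean(axis=(1, 2))              # (C,) global average pool
    hidden = np.maximum(w1 @ squeeze, 0.0)        # ReLU reduction layer
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid gate, (C,)
    return feat * scale[:, None, None]

def dropblock(feat, keep_prob=0.9, block_size=3, rng=None):
    """DropBlock regularization (Ghiasi et al.): zero out contiguous
    block_size x block_size regions rather than independent activations,
    so spatially correlated features are actually removed."""
    rng = np.random.default_rng(rng)
    c, h, w = feat.shape
    # Seed-rate correction from the DropBlock paper, so the expected
    # fraction of dropped units matches (1 - keep_prob).
    gamma = ((1.0 - keep_prob) / block_size**2) \
        * (h * w) / ((h - block_size + 1) * (w - block_size + 1))
    half = block_size // 2
    # Sample block centres only where a full block fits inside the map.
    centres = rng.random((c, h, w)) < gamma
    centres[:, :half, :] = centres[:, h - half:, :] = False
    centres[:, :, :half] = centres[:, :, w - half:] = False
    mask = np.ones((c, h, w))
    for ch, i, j in zip(*np.nonzero(centres)):
        mask[ch, i - half:i + half + 1, j - half:j + half + 1] = 0.0
    kept = mask.mean()
    # Rescale surviving activations to preserve the expected magnitude.
    return feat * mask / max(kept, 1e-8)
```

At training time the two operations would be applied inside the convolutional blocks; at test time DropBlock is disabled (equivalent to `keep_prob=1.0`, which leaves the features unchanged).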
[1] WU K J, HUANG T, WANG D C, et al. Research progress of video anomaly detection technology [J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(3): 529-540. (in Chinese)
[2] HE P, LI G, LI H B. A survey of video anomaly detection methods based on deep learning [J]. Computer Engineering & Science, 2022, 44(9): 1620-1629. (in Chinese)
[3] HASAN M, CHOI J, NEUMANN J, et al. Learning temporal regularity in video sequences [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 733-742.
[4] PARK H, NOH J, HAM B. Learning memory-guided normality for anomaly detection [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 14372-14381.
[5] LIU W, LUO W, LIAN D, et al. Future frame prediction for anomaly detection - a new baseline [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 6536-6545.
[6] RAVANBAKHSH M, SANGINETO E, NABI M, et al. Training adversarial discriminators for cross-channel abnormal event detection in crowds [C]// 2019 IEEE Winter Conference on Applications of Computer Vision (WACV). Waikoloa: IEEE, 2019: 1896-1904.
[7] GHIASI G, LIN T Y, LE Q V. DropBlock: a regularization method for convolutional networks [J]. Advances in Neural Information Processing Systems, 2018, 31: 10750-10760.
[8] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module [C]// Proceedings of the European Conference on Computer Vision (ECCV). Munich: Springer, 2018: 3-19.
[9] HU J, SHEN L, SUN G. Squeeze-and-excitation networks [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 7132-7141.
[10] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16x16 words: Transformers for image recognition at scale [J]. arXiv: 2010.11929, 2020.
[11] VELIČKOVIĆ P, CUCURULL G, CASANOVA A, et al. Graph attention networks [J]. arXiv: 1710.10903, 2017.
[12] ZHANG J, QI X, JI G. Self-attention based bi-directional long short-term memory autoencoder for video anomaly detection [C]// 2021 Ninth International Conference on Advanced Cloud and Big Data (CBD). Xi'an: IEEE, 2022: 107-112.
[13] WANG S, MIAO Z. Anomaly detection in crowd scene [C]// IEEE 10th International Conference on Signal Processing Proceedings. Beijing: IEEE, 2010: 1220-1223.
[14] LU C, SHI J, JIA J. Abnormal event detection at 150 FPS in MATLAB [C]// Proceedings of the IEEE International Conference on Computer Vision. Sydney: IEEE, 2013: 2720-2727.
[15] LUO W, LIU W, GAO S. A revisit of sparse coding based anomaly detection in stacked RNN framework [C]// Proceedings of the IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 341-349.
[16] LV H, CHEN C, CUI Z, et al. Learning normal dynamics in videos with meta prototype network [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 15425-15434.
[17] FAN Y, WEN G, LI D, et al. Video anomaly detection and localization via Gaussian mixture fully convolutional variational autoencoder [J]. Computer Vision and Image Understanding, 2020, 195: 102920.
[18] TANG Y, ZHAO L, ZHANG S, et al. Integrating prediction and reconstruction for anomaly detection [J]. Pattern Recognition Letters, 2020, 129: 123-130.
[19] WANG B, YANG C. Video anomaly detection based on convolutional recurrent autoencoder [J]. Sensors, 2022, 22(12): 4647.
[20] PARK C, CHO M A, LEE M, et al. FastAno: fast anomaly detection via spatio-temporal patch transformation [C]// Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. Waikoloa: IEEE, 2022: 2249-2259.
[21] ARIYANI S, YUNIARNO E M, PURNOMO M H. Multiperson key points detection for abnormal human behavior analysis using the ConvLSTM-AE method [C]// 2022 International Conference on Computer Engineering, Network, and Intelligent Multimedia (CENIM). Surabaya: IEEE, 2022: 1-7.
[22] AICH A, PENG K C, ROY-CHOWDHURY A K. Cross-domain video anomaly detection without target domain adaptation [C]// Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. Waikoloa: IEEE, 2023: 2579-2591.
[23] SINGH R, SAINI K, SETHI A, et al. STemGAN: spatio-temporal generative adversarial network for video anomaly detection [J]. Applied Intelligence, 2023, 53(23): 28133-28152.
[24] ASLAM N, KOLEKAR M H. TransGANomaly: Transformer based generative adversarial network for video anomaly detection [J]. Journal of Visual Communication and Image Representation, 2024, 100: 104108.
Basic information:
DOI: 10.20165/j.cnki.ISSN1673-629X.2025.0251
CLC number: TP391.41; TP18
Citation:
[1] SHI S Q, YANG D W. Video anomaly detection algorithm integrating DropBlock and attention mechanism [J]. Computer Technology and Development, 2026, 36(02): 54-61. DOI: 10.20165/j.cnki.ISSN1673-629X.2025.0251. (in Chinese)
Funding:
Natural Science Foundation of Liaoning Province, General Program (2022-MS-276)
2025-09-22