12 January 2023 ShiftMask: video behavior recognition data augmentation
Liang Tan, Renze Luo, Renquan Luo, Hong Yu, Zhilin Deng
Proceedings Volume 12509, Third International Conference on Intelligent Computing and Human-Computer Interaction (ICHCI 2022); 1250912 (2023) https://doi.org/10.1117/12.2656003
Event: Third International Conference on Intelligent Computing and Human-Computer Interaction (ICHCI 2022), 2022, Guangzhou, China
Abstract
Recently proposed data augmentation methods have been used to mitigate overfitting in neural networks and are gradually becoming a research focus in deep learning. They have been widely applied to tasks such as image recognition, object detection, and image segmentation. For video data, however, existing data augmentation methods are limited in feature dimensionality: they affect only the spatial features of the training samples. They also do not filter the temporal information of the training samples, so the augmented samples still contain considerable redundant information and the video behavior recognition model still overfits. This paper therefore proposes the ShiftMask data augmentation algorithm. The method uses a new masking approach that masks the temporal and spatial dimensions of video data in a correlated way, helping the model identify the subject of the action and reducing the model's subject and scene learning bias during training. It also uses grid-like masks of varying sizes to preserve the data's fine-grained features. In experiments on the HMDB51 and TobaccoFactory datasets, ShiftMask improved the recognition accuracy of the I3D model by 4.5% and 2.9%, respectively, outperforming other mainstream data augmentation methods.
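The abstract does not specify the masking procedure, so the following is only a minimal sketch of the general idea it describes: a grid-like spatial mask whose position is shifted from frame to frame, so that the masking is coupled across the spatial and temporal dimensions. The function names (`grid_mask`, `shifted_grid_mask_clip`) and the parameters `cell`, `keep`, and `shift_per_frame` are illustrative assumptions, not the authors' implementation or notation.

```python
import numpy as np


def grid_mask(height, width, cell, keep, offset_y=0, offset_x=0):
    """Build an (height, width) mask of 1s (kept) and 0s (masked).

    Hypothetical helper: each cell-by-cell tile keeps everything except a
    (cell - keep)-sized square, and the grid is displaced by the offsets.
    """
    ys = (np.arange(height) + offset_y) % cell
    xs = (np.arange(width) + offset_x) % cell
    # A pixel is masked only where both its row and column fall in the
    # masked part of the tile, which yields a regular grid of squares.
    masked = (ys[:, None] >= keep) & (xs[None, :] >= keep)
    return (~masked).astype(np.float32)


def shifted_grid_mask_clip(clip, cell=32, keep=20, shift_per_frame=4):
    """Apply a grid mask to a (T, H, W, C) clip, shifting it every frame.

    Assumption: a constant diagonal shift per frame stands in for whatever
    temporal-spatial coupling the paper actually uses.
    """
    num_frames, height, width, _ = clip.shape
    out = np.empty_like(clip)
    for t in range(num_frames):
        offset = t * shift_per_frame
        mask = grid_mask(height, width, cell, keep, offset, offset)
        # Broadcast the 2D mask over the channel axis.
        out[t] = clip[t] * mask[:, :, None]
    return out


# Usage: mask a dummy 8-frame clip; the grid size could be sampled per clip
# to obtain masks of varying sizes.
dummy_clip = np.ones((8, 64, 64, 3), dtype=np.float32)
augmented = shifted_grid_mask_clip(dummy_clip, cell=16, keep=10)
```

Because the masked squares move between frames, a region hidden in one frame is visible in its neighbors, which is one plausible way to remove redundant information without discarding fine-grained detail across the whole clip.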
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Liang Tan, Renze Luo, Renquan Luo, Hong Yu, and Zhilin Deng "ShiftMask: video behavior recognition data augmentation", Proc. SPIE 12509, Third International Conference on Intelligent Computing and Human-Computer Interaction (ICHCI 2022), 1250912 (12 January 2023); https://doi.org/10.1117/12.2656003
KEYWORDS
Video

Data modeling

Statistical modeling

Performance modeling

RGB color model

Detection and tracking algorithms

Image segmentation
