This paper introduces an algorithm for detecting anomalous behavior in surveillance video using an improved autoencoder with multimodal inputs. The autoencoder uses 3D convolution and 3D deconvolution, and at a specific layer the decoder adds the corresponding encoder feature map to enhance image detail. Taking RGB frames and optical flow as inputs, abnormality scores are calculated from the reconstruction error and used to locate abnormal segments. Experiments on the CUHK Avenue dataset, the UCSD Pedestrian dataset, and the Behave dataset show that our approach outperforms the original approach. Besides improving the AUC, the method is unsupervised, which saves considerable labeling time and better suits the diversity and contingency of abnormal behavior in real life.
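The scoring step can be illustrated with a minimal sketch. The abstract states only that abnormality scores are derived from the reconstruction error; the squared L2 error, the min-max normalization, the weighted fusion of the RGB and optical-flow errors, and the threshold are all assumptions made here for illustration, as are the hypothetical function names and the `flow_weight` and `threshold` parameters. The autoencoder itself is not shown; its reconstructions are taken as given inputs.

```python
import numpy as np


def reconstruction_error(clip, recon):
    """Per-frame squared L2 error between a clip and its reconstruction.

    Both arrays are assumed to have shape (T, H, W, C).
    """
    diff = clip.astype(np.float64) - recon.astype(np.float64)
    return (diff ** 2).reshape(diff.shape[0], -1).sum(axis=1)


def _minmax(values, eps=1e-12):
    """Scale a 1-D sequence to the range [0, 1]."""
    v = np.asarray(values, dtype=np.float64)
    return (v - v.min()) / (v.max() - v.min() + eps)


def abnormality_score(err_rgb, err_flow, flow_weight=0.5):
    """Fuse the two per-frame error curves into one score in [0, 1].

    The weighted sum is an assumed fusion rule, not taken from the paper.
    """
    fused = (1.0 - flow_weight) * _minmax(err_rgb) + flow_weight * _minmax(err_flow)
    return _minmax(fused)


def abnormal_segments(scores, threshold=0.5):
    """Return inclusive (start, end) frame ranges whose score exceeds threshold."""
    segments, start = [], None
    for t, s in enumerate(scores):
        if s > threshold and start is None:
            start = t
        elif s <= threshold and start is not None:
            segments.append((start, t - 1))
            start = None
    if start is not None:
        segments.append((start, len(scores) - 1))
    return segments
```

Given per-frame errors for both modalities, `abnormal_segments(abnormality_score(err_rgb, err_flow))` yields the frame ranges flagged as abnormal. Higher scores mean poorer reconstruction, which is the sense in which a model trained only on normal footage signals an anomaly.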