Approximately 37 million falls requiring medical attention occur worldwide each year. Victims are often helpless and unable to call for help, which poses a particular risk for elderly persons living alone. Several approaches have been proposed to detect falls at home, and video cameras are increasingly used. Recently, novel machine learning techniques have achieved high accuracy in real-time human pose estimation in videos. In this work, we propose a multi-camera system for video-based fall detection. We augment human pose estimation (OpenPifPaf algorithm) with support for multi-camera and multi-person tracking and with a long short-term memory (LSTM) neural network that predicts one of two classes: “Fall” or “No Fall”. From the poses, we extract five temporal and spatial features, which are processed by the LSTM. To evaluate identification and tracking with multiple cameras, we used videos recorded in a smart home (living lab) with two persons walking and interacting. To evaluate fall detection, we used the UP-Fall Detection dataset and achieved an F1 score of 92.5%. We observed a tendency towards false positive classifications because publicly available datasets lack activities that look similar to falls but stem from normal activity. The limited variation in the recorded activities further increases the number of false positives. Addressing this requires the acquisition of more balanced datasets in future work. In conclusion, real-time fall detection from multiple camera inputs and for multiple persons is feasible using an LSTM neural network combined with features obtained via human pose estimation. Source code is available at https://github.com/taufeeque9/HumanFallDetection.
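To make the reported metric concrete, the following minimal sketch shows how an F1 score for the “Fall” class can be computed from confusion counts, and how additional false positives lower it through reduced precision. The function name `f1_score` and all counts are hypothetical and chosen only for illustration; they are not taken from the evaluation on the UP-Fall Detection dataset.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1 score of the positive ("Fall") class.

    tp: falls correctly detected
    fp: normal activities wrongly classified as falls
    fn: falls that were missed
    """
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts (assumptions for illustration, not results from the paper).
balanced = f1_score(tp=90, fp=10, fn=10)
# Same recall, but three times as many false positives.
fp_heavy = f1_score(tp=90, fp=30, fn=10)

print(f"balanced: {balanced:.3f}, false-positive heavy: {fp_heavy:.3f}")
```

With recall held fixed, the second case scores lower purely because of the extra false positives, which is the effect described above for activities that resemble falls.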