As demand for long-term health monitoring grows, remote photoplethysmography (rPPG) has attracted increasing research interest. However, conventional methods are vulnerable to noise caused by non-rigid facial movements (facial expressions, talking, etc.), so suppressing these interferences and improving rPPG signal quality are key challenges in heart rate (HR) estimation. We propose an approach that extracts high-quality rPPG signals from multiple facial subregions by fusing static and dynamic weights, and then estimates the HR value with a convolutional neural network after converting the 1D rPPG signal into 2D time-frequency analysis maps. Specifically, chrominance features from the various regions of interest are used to generate a raw subregion rPPG signal set, from which the static weights of the regions are estimated through a clustering method; a measure called enclosed-area distance is proposed to perform this static-weight estimation. The dynamic weights of the regions are computed with a 3D-gradient descriptor, which evaluates the degree of activity under regional movement and thereby suppresses motion interference. The final rPPG signal is reconstructed by combining the subregion rPPG signals using the static and dynamic weights. Experiments are conducted on two widely used public datasets, MAHNOB-HCI and PURE. The results demonstrate that the proposed method achieves an MAE of 3.12 and an SD of 3.78 on MAHNOB-HCI and the best correlation coefficient r on PURE, significantly outperforming state-of-the-art methods.
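The fusion step described above can be illustrated with a short sketch. This is not the paper's implementation: the function names, the multiplicative combination of static and dynamic weights, and the STFT-based time-frequency map are assumptions used only to make the pipeline concrete.

```python
import numpy as np
from scipy.signal import stft

def fuse_subregion_signals(sub_signals, static_w, dynamic_w):
    """Fuse per-subregion rPPG traces into one signal.

    sub_signals: (n_regions, n_samples) raw subregion rPPG traces.
    static_w:    (n_regions,) region reliability weights (e.g. from clustering).
    dynamic_w:   (n_regions, n_samples) motion-dependent weights (e.g. derived
                 from a 3D-gradient descriptor); any nonnegative array works here.
    """
    w = static_w[:, None] * dynamic_w          # combine static and dynamic weights
    w_sum = w.sum(axis=0)
    w = w / np.where(w_sum > 0, w_sum, 1.0)    # normalize across regions per sample
    return (w * sub_signals).sum(axis=0)

def time_frequency_map(signal, fs=30.0):
    """Convert a 1D rPPG trace into a 2D time-frequency magnitude map via STFT,
    one plausible way to build the CNN input described in the abstract."""
    _, _, Z = stft(signal, fs=fs, nperseg=64)
    return np.abs(Z)
```

With uniform dynamic weights the fusion reduces to a static weighted average of the subregion traces, which is a useful sanity check.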
Heart rate (HR) measurement from facial videos makes physiological monitoring more convenient and has attracted much research attention. However, existing methods are not robust to motion disturbance caused by non-rigid facial movements (e.g., smiling, yawning, and talking), which change the illumination on local skin regions and thereby cause fluctuations of the color intensity values in the regions of interest. Unlike previous approaches that use fixed regions, we propose a self-adaptive region selection algorithm and then train a 1D convolutional neural network (CNN) that fuses the selected regions for HR estimation. Concretely, an observation blood volume pulse (BVP) matrix composed of sub-BVP signals is constructed from the separate regions and denoised by rank minimization. To exploit the sub-BVP signals of different regions, a clustering-based method is designed to generate regional weights, and self-adaptive region selection dynamically filters out useful face regions for robust HR estimation. Finally, the regions and weights are separately embedded into the 1D CNN through a multiregion-fusion module. Experiments on the MAHNOB-HCI database show that the proposed approach is effective and achieves the best performance on five metrics.
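Two of the building blocks above can be sketched in a few lines. The singular value thresholding used here is a standard convex surrogate for the rank minimization the abstract mentions, and the agreement-with-median weighting is only an illustrative stand-in for the paper's clustering-based regional weights, which are not specified in this abstract.

```python
import numpy as np

def svt_denoise(bvp_matrix, tau=1.0):
    """Denoise an observation BVP matrix (regions x samples) by soft-thresholding
    its singular values, a common convex relaxation of rank minimization."""
    U, s, Vt = np.linalg.svd(bvp_matrix, full_matrices=False)
    s = np.maximum(s - tau, 0.0)   # shrink small singular values toward zero
    return (U * s) @ Vt

def region_weights(sub_bvp):
    """Illustrative regional weights: regions whose sub-BVP trace agrees with
    the element-wise median trace receive larger (normalized) weight."""
    ref = np.median(sub_bvp, axis=0)
    corr = np.array([np.corrcoef(x, ref)[0, 1] for x in sub_bvp])
    w = np.clip(corr, 0.0, None)               # discard anti-correlated regions
    total = w.sum()
    return w / total if total > 0 else np.full(len(w), 1.0 / len(w))
```

With `tau=0` the denoiser is the identity, so the threshold controls how aggressively weak (noisy) components are removed.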