In computer vision and graphics, a number of recent approaches have achieved high-quality reconstruction of the human body in static scenes using a single multilayer perceptron (MLP). However, MLPs have limited capacity: dynamic scene reconstruction requires substantial training time and computational resources, and the reconstruction quality is significantly constrained. We propose a method for effectively processing complex spatiotemporal signals in 3D human modeling of dynamic scenes. The proposed method uses temporal residual neural radiance fields to achieve novel view rendering and novel pose synthesis of human bodies. First, to address the problem of representing temporal signals in video sequences, we construct a temporal residual field that is independent of the MLP architecture. Second, to improve reconstruction efficiency, we propose an integrated approach that reduces the number of trainable parameters and accelerates rendering, thereby enhancing the network's feature representation capability. Finally, we design a multi-dimensional loss function to accurately measure the loss between predicted and actual spatial pixel values. Experimental results show that the proposed approach improves the peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) compared with the latest representative methods. It maintains accuracy similar to that of Anim-NeRF and Neural Body while achieving a nearly 780-fold increase in time efficiency.
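For readers unfamiliar with the peak signal-to-noise ratio reported in the abstract, the following is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE). It is not the authors' evaluation code. The function name `psnr`, the flat-list pixel representation, and the assumption that pixel values lie in [0, `max_val`] are illustrative choices only.

```python
import math


def psnr(pred, target, max_val=1.0):
    """Standard PSNR in decibels between two equal-length pixel sequences.

    Assumption (not from the paper): pixels are floats in [0, max_val],
    flattened into plain Python sequences.
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0.0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_val ** 2 / mse)


if __name__ == "__main__":
    rendered = [0.0, 0.0, 0.5, 1.0]
    reference = [0.1, 0.1, 0.5, 1.0]
    print(f"PSNR: {psnr(rendered, reference):.2f} dB")
```

A higher PSNR means the rendered image is closer to the reference in a per-pixel sense. The structural similarity index complements it by comparing local structure rather than raw pixel error.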
Education and training
Video
3D modeling
Video acceleration
Neural networks
Modeling
Image restoration