Speech enhancement is a critical part of many types of communication systems and automatic speech recognition (ASR) applications. In this study, we propose a speech enhancement method for real-time VoIP applications based on stacked frames and a deep neural network; a novel data preparation approach is also introduced. In contrast to many state-of-the-art learning-based methods, we focus on real-time implementation in VoIP applications. Experiments were conducted on speech degraded by noise types and SNR levels that were not seen during training of the deep neural network, and the method achieved a significant improvement in PESQ. An important traditional real-time speech enhancement method and the most recent state-of-the-art learning-based method were also tested and compared with the proposed method. The results show that the proposed method effectively improves speech intelligibility and greatly outperforms the traditional real-time minimum mean-square error (MMSE) algorithm and a real-time learning-based CNN method in PESQ. It also achieves PESQ comparable to that of the most recent state-of-the-art learning-based method while outperforming it in time complexity. This makes the method attractive for VoIP communication systems, which place high demands on communication latency.
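As an illustration of the stacked-frame input mentioned above, the following minimal sketch concatenates each spectral frame with a few of its predecessors to form one network input vector. The function name `stack_frames`, the context sizes, and the edge padding are assumptions made for this example; the abstract does not specify how the frames are stacked. Using only past frames (`right=0`) is one plausible choice for low latency, since no look-ahead is needed.

```python
import numpy as np

def stack_frames(features, left=2, right=0):
    """Stack each frame with its neighbours along the feature axis.

    features: array of shape (num_frames, num_bins), e.g. log-magnitude spectra.
    left/right: number of past/future context frames (hypothetical defaults).
    Returns an array of shape (num_frames, (left + right + 1) * num_bins).
    """
    num_frames = features.shape[0]
    # Repeat the first/last frame so every frame has a full context window.
    padded = np.pad(features, ((left, right), (0, 0)), mode="edge")
    # Column block k holds the frame at offset (k - left) from the current frame.
    blocks = [padded[k:k + num_frames] for k in range(left + right + 1)]
    return np.concatenate(blocks, axis=1)

if __name__ == "__main__":
    spectra = np.random.rand(100, 129)      # 100 frames, 129 frequency bins
    stacked = stack_frames(spectra, left=2)
    print(stacked.shape)                    # (100, 387)
```

With `right=0` the stacked vector for frame t depends only on frames t-left through t, so the added algorithmic delay is zero frames; a nonzero `right` would trade latency for more context.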
Transmission map estimation is one of the most important parts of single image dehazing, which is known to be an under-constrained problem. Various assumptions, some of which have been summarized as priors or models, have been proposed to solve this problem. However, many previous methods do not faithfully reflect their underlying theory in their results, because they do not consider all of the employed assumptions simultaneously. Meanwhile, most methods that avoid this defect rely on inappropriate assumptions. We address this problem by proposing a method that simultaneously considers the dark channel prior and the piecewise smoothness assumption. This is achieved by minimizing an energy function based on all of the employed assumptions. To maintain a reasonable run time, the minimization problem is mapped to a graph cut problem with a specific graph construction strategy. The method is compared with state-of-the-art methods on both synthetic and natural images. Experimental results show that the proposed method is promising in terms of haze removal quality.
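For readers unfamiliar with the dark channel prior, the sketch below computes a dark channel and the coarse transmission estimate commonly derived from it, t = 1 - omega * dark(I / A). It illustrates only the prior itself, not the energy function or graph cut formulation described above; the function names, the patch size, and the value of `omega` are assumptions for this example, and the atmospheric light `airlight` is taken as given.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def dark_channel(image, patch=3):
    """Minimum over color channels, then over a patch x patch neighbourhood.

    image: float array of shape (H, W, 3); patch: odd window size.
    """
    min_rgb = image.min(axis=2)
    r = patch // 2
    padded = np.pad(min_rgb, r, mode="edge")          # keep output size H x W
    windows = sliding_window_view(padded, (patch, patch))
    return windows.min(axis=(2, 3))

def coarse_transmission(image, airlight, omega=0.95, patch=3):
    """Coarse transmission map from the dark channel of the airlight-normalized image."""
    normalized = image / np.asarray(airlight, dtype=float)
    return 1.0 - omega * dark_channel(normalized, patch)

if __name__ == "__main__":
    hazy = np.random.rand(32, 32, 3)
    t = coarse_transmission(hazy, airlight=[1.0, 1.0, 1.0])
    print(t.shape, float(t.min()), float(t.max()))
```

The patch-wise minimum makes this estimate blocky near depth edges, which is one reason a smoothness term is usually combined with the prior.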
Camera calibration is a fundamental task in computer vision and photogrammetry. This paper presents an approach for the automatic estimation of intrinsic parameters from images by using vanishing points of orthogonal directions. First, image lines are extracted and clustered into groups corresponding to three dominant vanishing points. Second, camera parameters, including the radial distortion, are estimated by adjusting linear parameters locating the images. Finally, the rotation matrix of the projection matrix is computed from the vanishing points, and the image edges and the translation matrix are obtained with the help of an additional translational motion between the viewpoints. Our approach does not need any a priori information about the cameras being used. Experiments evaluating the performance of this approach on synthetic and real data are reported.
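The relation between orthogonal vanishing points and the intrinsic parameters can be illustrated with a simplified case. Assuming zero skew, square pixels, no radial distortion, and a known principal point p, two vanishing points v1 and v2 of orthogonal directions satisfy (v1 - p) · (v2 - p) + f^2 = 0, which gives the focal length f directly. The sketch below uses this relation; it is not the estimation procedure of the paper, which also handles radial distortion and needs no prior camera information, and the function names are hypothetical.

```python
import numpy as np

def focal_from_vanishing_points(v1, v2, principal_point):
    """Focal length (pixels) from two vanishing points of orthogonal directions.

    Assumes zero skew, unit aspect ratio and a known principal point.
    """
    p = np.asarray(principal_point, dtype=float)
    d = np.dot(np.asarray(v1, dtype=float) - p, np.asarray(v2, dtype=float) - p)
    if d >= 0:
        raise ValueError("vanishing points inconsistent with orthogonal directions")
    return float(np.sqrt(-d))

def direction_from_vanishing_point(v, f, principal_point):
    """Unit 3D direction (camera frame, up to sign) that projects to vanishing point v."""
    p = np.asarray(principal_point, dtype=float)
    ray = np.array([v[0] - p[0], v[1] - p[1], f], dtype=float)
    return ray / np.linalg.norm(ray)

def synthetic_vanishing_points(f, principal_point, R):
    """Project the three axis directions (columns of R) to image vanishing points."""
    K = np.array([[f, 0.0, principal_point[0]],
                  [0.0, f, principal_point[1]],
                  [0.0, 0.0, 1.0]])
    homog = K @ R                       # column i is the homogeneous vanishing point
    return [homog[:2, i] / homog[2, i] for i in range(3)]

if __name__ == "__main__":
    ax, ay = np.deg2rad(20.0), np.deg2rad(30.0)
    Rx = np.array([[1, 0, 0], [0, np.cos(ax), -np.sin(ax)], [0, np.sin(ax), np.cos(ax)]])
    Ry = np.array([[np.cos(ay), 0, np.sin(ay)], [0, 1, 0], [-np.sin(ay), 0, np.cos(ay)]])
    vps = synthetic_vanishing_points(800.0, (320.0, 240.0), Rx @ Ry)
    print(focal_from_vanishing_points(vps[0], vps[1], (320.0, 240.0)))
```

The directions recovered from the three vanishing points form the columns of the rotation matrix up to sign, which is how a rotation can be obtained from vanishing points once the intrinsics are known.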