Laparoscopic videos can be affected by various distortions that may impair surgical performance and introduce surgical errors. In this work, we propose a framework for automatically detecting and identifying such distortions and their severity using video quality assessment. This work presents three major contributions: (i) a novel video enhancement framework for laparoscopic surgery; (ii) a publicly available database for quality assessment of laparoscopic videos, rated by expert as well as non-expert observers; and (iii) an objective video quality assessment of laparoscopic videos, including correlations with expert and non-expert scores.
In minimally invasive surgery, smoke generated by procedures such as electrocautery and laser ablation severely deteriorates image quality. It obscures the surgeon's view, which may increase surgical risk and degrade the performance of computer-assisted surgery algorithms such as segmentation, reconstruction, and tracking. Real-time smoke removal is therefore required to maintain a clear field of view. In this paper, we propose a real-time smoke removal approach based on a Convolutional Neural Network (CNN). We propose an encoder-decoder architecture with a Laplacian image pyramid decomposition input strategy. This end-to-end network takes the smoke image and its Laplacian image pyramid decomposition as inputs and directly outputs a smoke-free image, without relying on any physical model or the estimation of intermediate parameters. The design can easily be embedded into subsequent deep-learning-based image-guided surgery tasks such as segmentation and tracking. A dataset of synthetic smoke images generated with Blender and Adobe Photoshop is employed to train the network. The results are evaluated quantitatively on synthetic images and qualitatively on a laparoscopic dataset degraded with real smoke. The proposed method eliminates smoke effectively while preserving the original colors, and reaches 26 fps for 512 × 512 video on our training machine. The obtained results demonstrate the efficiency and effectiveness of the proposed CNN structure, and also show the potential of training the network on a synthetic dataset.
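The Laplacian pyramid input strategy described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the pyramid depth is arbitrary, and a simple 2× box downsample with nearest-neighbour upsampling stands in for whatever smoothing kernel the network actually uses.

```python
import numpy as np

def downsample(img):
    # Simple 2x box downsampling (a stand-in for a Gaussian pyramid step).
    h, w = img.shape
    return img[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img, shape):
    # Nearest-neighbour upsampling back to the finer resolution.
    up = img.repeat(2, axis=0).repeat(2, axis=1)
    return up[:shape[0], :shape[1]]

def laplacian_pyramid(img, levels=3):
    """Decompose an image into `levels` band-pass layers plus a low-pass residual.

    These layers, together with the original image, form the multi-scale
    input fed to the encoder-decoder network.
    """
    pyramid = []
    current = img.astype(np.float64)
    for _ in range(levels):
        low = downsample(current)
        pyramid.append(current - upsample(low, current.shape))  # band-pass detail
        current = low
    pyramid.append(current)  # low-frequency residual
    return pyramid
```

By construction this decomposition is lossless: iteratively upsampling the residual and adding each band-pass layer recovers the input exactly, so the network sees the full image content separated by frequency band.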
Quantifying image quality without a reference is still a challenging problem, especially when different distortions affect the observed image. We propose a no-reference image quality assessment (NR-IQA) metric based on a fusion scheme of multiple distortion measures. The metric is built in two stages. First, a set of relevant IQA metrics is selected using a particle swarm optimization scheme. Then, a support vector regression (SVR)-based fusion strategy is adopted to derive the overall image quality index. The obtained results clearly demonstrate that the proposed approach outperforms state-of-the-art NR-IQA methods. Furthermore, the approach is flexible and could be extended to other distortions.
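The first stage can be sketched as a binary particle swarm search over subsets of candidate metrics. This is an illustrative sketch only: the fitness function (correlation of a fused score with subjective MOS values), the simple averaging used as the fusion surrogate, and all hyperparameters are assumptions; the paper's SVR fusion stage is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(mask, scores, mos):
    # Subset quality: correlation of the fused (here: averaged) scores with MOS.
    if mask.sum() == 0:
        return 0.0
    fused = scores[:, mask.astype(bool)].mean(axis=1)
    return abs(np.corrcoef(fused, mos)[0, 1])

def binary_pso(scores, mos, n_particles=10, iters=30):
    """Binary PSO over subsets of candidate IQA metrics (columns of `scores`)."""
    n_metrics = scores.shape[1]
    pos = (rng.random((n_particles, n_metrics)) > 0.5).astype(float)
    vel = rng.normal(0.0, 1.0, (n_particles, n_metrics))
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p, scores, mos) for p in pos])
    gbest = pbest[pbest_fit.argmax()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, n_metrics))
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        # Sigmoid rule: each bit is set with probability sigmoid(velocity).
        pos = (rng.random((n_particles, n_metrics)) < 1.0 / (1.0 + np.exp(-vel))).astype(float)
        fit = np.array([fitness(p, scores, mos) for p in pos])
        improved = fit > pbest_fit
        pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
        gbest = pbest[pbest_fit.argmax()].copy()
    return gbest
```

The selected subset would then feed the second-stage SVR regressor, which maps the retained metric scores to the final quality index.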
Human action recognition has drawn much attention in the field of video analysis. In this paper, we develop a human action detection and recognition process based on the tracking of interest-point (IP) trajectories. A pre-processing step that performs spatio-temporal action detection is proposed. This step uses optical flow along with dense speeded-up robust features (SURF) to detect and track moving humans in moving fields of view. The video description step is based on a fusion process that combines displacement and spatio-temporal descriptors. Experiments are carried out on the large UCF-101 dataset. Experimental results show that the proposed techniques achieve better performance than many existing state-of-the-art action recognition approaches.
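A common way to encode a tracked interest-point trajectory as a displacement descriptor is sketched below. This is a generic trajectory-shape encoding, not necessarily the exact descriptor used in the paper: the frame-to-frame displacements are concatenated and normalized by the total path length so that the descriptor captures motion shape rather than magnitude.

```python
import numpy as np

def displacement_descriptor(track):
    """Normalized displacement descriptor for one interest-point trajectory.

    track: (T, 2) array of (x, y) positions over T frames.
    Returns the concatenated frame-to-frame displacement vectors,
    normalized by the total path length.
    """
    disp = np.diff(track, axis=0)               # (T-1, 2) displacement vectors
    total = np.linalg.norm(disp, axis=1).sum()  # total path length
    if total == 0:
        return disp.ravel()                     # static point: zero descriptor
    return (disp / total).ravel()
```

Descriptors of this kind would then be fused with spatio-temporal appearance descriptors before classification.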
Kernel-design-based methods, such as the bilateral filter (BIL) and the non-local means (NLM) filter, are among the most attractive approaches to denoising. In this paper, we propose a new noise filtering method inspired by the BIL and NLM filters and by principal component analysis (PCA). The main idea is to apply the BIL in a multidimensional PCA space using an anisotropic kernel. The filtered multidimensional signal is then transformed back into the image spatial domain to yield the desired enhanced image. We demonstrate that the proposed method is a generalization of kernel-design-based methods. The obtained results are highly promising.
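The idea of running a bilateral-style filter in a PCA patch space can be sketched as follows. This is a simplified, isotropic illustration of the concept, not the paper's anisotropic-kernel method; patch size, PCA dimensionality, and bandwidth parameters are all illustrative, and the brute-force loop is only suitable for small images.

```python
import numpy as np

def pca_bilateral_denoise(img, patch=3, dims=4, h=1.0, sigma_s=3.0):
    """Denoising sketch: bilateral-style weights computed in a PCA patch space.

    Each pixel's patch is projected onto the top `dims` principal components;
    the filter weight combines spatial distance (as in BIL) with distance in
    the PCA feature space (as in NLM).
    """
    r = patch // 2
    pad = np.pad(img, r, mode='reflect')
    H, W = img.shape
    # Collect every pixel's patch as a row vector.
    patches = np.array([pad[i:i + patch, j:j + patch].ravel()
                        for i in range(H) for j in range(W)])
    # PCA: project centred patches onto the leading principal components.
    mean = patches.mean(axis=0)
    _, _, vt = np.linalg.svd(patches - mean, full_matrices=False)
    feats = (patches - mean) @ vt[:dims].T            # (H*W, dims)
    ys, xs = np.mgrid[0:H, 0:W]
    coords = np.stack([ys.ravel(), xs.ravel()], 1).astype(float)
    out = np.empty(H * W)
    for k in range(H * W):
        d_feat = ((feats - feats[k]) ** 2).sum(1)     # range distance, PCA space
        d_sp = ((coords - coords[k]) ** 2).sum(1)     # spatial distance
        w = np.exp(-d_feat / h ** 2 - d_sp / (2.0 * sigma_s ** 2))
        out[k] = (w * img.ravel()).sum() / w.sum()
    return out.reshape(H, W)
```

Setting `dims` to the full patch dimension and dropping the feature term recovers a plain bilateral/NLM-style filter, which is the sense in which such a scheme generalizes kernel-design-based methods.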
No-reference image quality metrics proposed in the literature are generally developed for specific degradations, thus limiting their application. To overcome this limitation, we propose in this study a no-reference image quality metric (NR-IQM) for ringing and blur distortions based on a neural weighting scheme. For a given image, we first estimate the levels of blur and ringing degradation using an Artificial Neural Network (ANN) model. The final quality index is then obtained by combining the blur and ringing measures with the weights estimated during the learning process. The obtained results are promising.
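The weighting scheme can be sketched as below. This is an illustrative stand-in for the trained ANN: a small MLP with softmax output maps image features to two non-negative weights, and the final index is the weighted combination of the blur and ringing measures. All shapes and weight values here are assumptions.

```python
import numpy as np

def ann_weights(features, W1, b1, W2, b2):
    """Forward pass of a small MLP mapping image features to two weights.

    A softmax output guarantees non-negative weights summing to one,
    standing in for the paper's trained weighting network.
    """
    hidden = np.tanh(features @ W1 + b1)
    logits = hidden @ W2 + b2
    e = np.exp(logits - logits.max())
    return e / e.sum()

def quality_index(m_blur, m_ring, weights):
    # Final index: weighted combination of the blur and ringing measures.
    return weights[0] * m_blur + weights[1] * m_ring
```

Because the weights form a convex combination, the final index always lies between the two individual distortion measures.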
A new global full-reference image quality system based on a classification and fusion scheme is proposed. It consists of several steps. The first step identifies the type of degradation contained in a given image using a Linear Discriminant Analysis (LDA) classifier with several common Image Quality Metrics (IQMs) as feature inputs. A per-degradation IQM (IQM-D) is then used to estimate the quality of the image. For a given degradation type, the appropriate IQM-D is derived by combining the three best-performing IQMs using an Artificial Neural Network model. The performance of the proposed scheme is first evaluated in terms of degradation identification accuracy. Then, for each distortion type, the image quality estimate is evaluated in terms of its correlation with subjective judgments on the TID2008 image database.
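The two-stage pipeline can be sketched as follows. This is a simplified illustration under stated assumptions: a two-class Fisher LDA stands in for the multi-class classifier (the multi-class case is analogous), and a linear combination of the top IQM scores stands in for the paper's ANN fusion; the regularization term and all data are illustrative.

```python
import numpy as np

def lda_direction(X0, X1):
    """Fisher LDA direction separating two degradation classes.

    X0, X1: (n, d) arrays of IQM feature vectors, one row per image.
    """
    m0, m1 = X0.mean(0), X1.mean(0)
    # Within-class scatter matrix (sum of per-class scatters).
    Sw = np.cov(X0.T, bias=True) * len(X0) + np.cov(X1.T, bias=True) * len(X1)
    w = np.linalg.solve(Sw + 1e-6 * np.eye(len(m0)), m1 - m0)  # small ridge
    return w / np.linalg.norm(w)

def classify(x, w, threshold):
    # Stage 1: assign the degradation type by projecting the IQM features.
    return int(x @ w > threshold)

def fused_iqm(scores, alphas):
    """Stage 2 sketch: combine the top IQMs for the identified degradation.

    A linear combination stands in for the paper's ANN fusion model.
    """
    return float(np.dot(scores, alphas))
```

Once stage 1 has identified the degradation type, only the fusion model trained for that type is applied, which is what lets each IQM-D specialize on a single distortion.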