Image-Guided Procedures, Robotic Interventions, and Modeling

Intraoperative on-the-fly organ-mosaicking for laparoscopic surgery

Daniel Reichard, Sebastian Bodenstedt, Stefan Suwelack, Rüdiger Dillmann, Stefanie Speidel

Karlsruhe Institute of Technology, Institute for Anthropomatics and Robotics, Adenauerring 2, D-76131 Karlsruhe, Germany

Benjamin Mayer, Anas Preukschas, Martin Wagner, Hannes Kenngott, Beat Müller-Stich

University of Heidelberg, Department of General, Abdominal and Transplantation Surgery, Im Neuenheimer Feld 110, D-69120 Heidelberg, Germany

J. Med. Imag. 2(4), 045001 (Dec 10, 2015). doi:10.1117/1.JMI.2.4.045001
History: Received August 3, 2015; Accepted November 4, 2015
Open Access

Abstract.  The goal of computer-assisted surgery is to provide the surgeon with guidance during an intervention, e.g., using augmented reality. To display preoperative data, soft tissue deformations that occur during surgery have to be taken into consideration. Laparoscopic sensors, such as stereo endoscopes, can be used to create a three-dimensional reconstruction of stereo frames for registration. Due to the small field of view and the homogeneous structure of tissue, reconstructing just one frame, in general, will not provide enough detail to register preoperative data, since every frame only contains a part of an organ surface. A correct assignment to the preoperative model is possible only if the patch geometry can be unambiguously matched to a part of the preoperative surface. We propose and evaluate a system that combines multiple smaller reconstructions from different viewpoints to segment and reconstruct a large model of an organ. Using graphics processing unit-based methods, we achieved four frames per second. We evaluated the system with in silico, phantom, ex vivo, and in vivo (porcine) data, using different methods for estimating the camera pose (optical tracking, iterative closest point, and a combination). The results indicate that the proposed method is promising for on-the-fly organ reconstruction and registration.

The number of minimally invasive surgeries performed yearly is increasing rapidly. This is largely due to the numerous benefits these types of intervention have for the patient: shorter hospital stay, less trauma, minimal scarring, and a lower chance of postsurgical complications. There are several drawbacks for the surgeon, though: limited hand-eye coordination, no haptic feedback, no direct line of sight, and a limited field of view.

Computer-assisted surgery tries to alleviate some of these drawbacks by providing the surgeon with information relevant to the state of the intervention. Prior to the intervention, preoperative data are acquired for diagnosis and surgical planning. Elaborate equipment (e.g., CT or MRI) generates precise data and also allows imaging from the interior of the body. Three-dimensional (3-D) models created from this data can provide the surgeon with a virtual view inside the patient during surgery. To this end, the models have to be registered to the current surgical scene, i.e., the current location and orientation of the real structure have to match those of the virtual one. The available tools for intraoperative imaging (e.g., endoscope) are limited in image quality and field of view. But they can be used to create intraoperative surface models that enable the registration process with the preoperative data.

Many groups have explored ways to obtain intraoperative surface models. To sample an intraoperative surface, Herline et al.1 used a probe in which the tip was moved over the visible parts of the liver. The probe was localized with an active position sensor. To avoid possible tissue damage, newer approaches commonly rely on ranged sensors. Laser range scanners used by Clements et al.2 offer high reconstruction quality for conventional liver surgery. The downside is the need of additional hardware in the operating room. Dumpuri et al.3 extended this approach to take intraoperative soft tissue deformation into account. After an initial rigid registration of the laser scan and CT surfaces, the residual closest point distances between the rigidly registered surfaces are minimized using a computational approach. The method was further refined by Rucker et al.4 using a tissue mechanics model subjected to boundary conditions, which were adjusted for liver resection therapy.

For registering preoperative data in laparoscopic surgery, the organ surface can be observed with optical laparoscopic sensors that provide a 3-D reconstruction of a single video frame. There are many methods for reconstructing 3-D surface structures.5 The most commonly used methods rely on multiple view geometry. Through correspondence analysis between two or more images, a 3-D reconstruction can be obtained via triangulation. Structure from motion (SfM) uses one camera with images from at least two different perspectives for triangulation. A stereo camera follows a similar approach but uses two image sensors that can be calibrated to each other; the known transformation between the two stereo images allows a more precise reconstruction. Instead of using naturally given correspondences, structured light methods project a known pattern onto the scene for active triangulation, which has proven difficult in surgical practice. The methods mentioned previously only reconstruct a small field of view, and due to the homogeneous structure of tissue, a single frame will, in general, not provide enough detail to rule out geometrical ambiguities during registration (i.e., an intraoperative surface patch has multiple possible matches on the preoperative model surface).

To remedy this problem, Plantefève et al.6 used anatomical landmarks to achieve a stable initial registration. The preoperative landmarks were labeled automatically while the intraoperative labeling required manual interaction. After the initial registration, a biomechanical model and the established correspondences between the landmarks were used to counteract intraoperative soft tissue deformation and movement.

To expand the reconstructed surface, methods to associate multiple frames are needed. One of these is the procedure of localizing the camera in the world while simultaneously mapping the environment, known as simultaneous localization and mapping (SLAM) in the literature. SLAM is a well-known approach in robotic mapping and has also found its way into computer-assisted laparoscopic surgery. Mountney et al.7 introduced a SLAM approach using a stereo endoscope to map the soft tissue of the liver. They worked with a sparse set of image texture features, which are tracked by an extended Kalman filter. In later work, the system was expanded to compensate for breathing motion.8

To recover from occlusions or sudden camera movements, Puerto-Souza et al.9,10 developed a robust feature matching, the hierarchical multiaffine (HMA) algorithm. In tests with real intervention data sets, the HMA algorithm exceeded the existing feature-matching methods in the number of image correspondences, speed, accuracy, and robustness.

SLAM can also be achieved with a single moving camera. With the previously mentioned SfM technique, reconstructing 3-D scene information is possible. In the work of Grasa et al.,11,12 this method is used to create a sparse reconstruction of a laparoscopic scene in real time. However, reconstructions from single-camera solutions do not provide an absolute scale. To approach this problem, Scaramuzza et al.13 used nonholonomic constraints. Recently, Newcombe et al.14 introduced the KinectFusion method, which provides dense reconstructions of medium-sized (nonmedical) scenes in real time using a Microsoft Kinect for data acquisition. In the work of Haase et al.,15 an extension of Newcombe et al.14 is used to reconstruct the surgical situs with multiple views taken by a 160×120 pixel time-of-flight camera.

In this paper, we present a system that combines 3-D reconstructions generated online by a stereo endoscope from multiple viewpoints, while simultaneously segmenting structures on-the-fly. It builds on our previous work16 and extends it with a detailed description of the method and an extensive evaluation on in silico, phantom, ex vivo, and in vivo data. In our system, the reconstructions and the segmentations are combined into one organ model. To compute a 3-D point cloud from a stereo image pair, the hybrid recursive matching (HRM) algorithm outlined by Röhl et al.17 was used. It was compared with other 3-D surface reconstruction methods by Maier-Hein et al.18 and achieved the best results. The segmentation of the organ of interest is done on the basis of color images. Using a random forest based classifier,19 each pixel is labeled as part of an organ of interest or background. The resulting point clouds and their respective labels are then integrated into a voxel-volume using a KinectFusion based algorithm.14 Given enough viewpoints, the voxel-volume will contain a combined model more suited for registration than the model generated from a single shot.

The novelty of the approach presented in this work is the application of a stereo endoscope, a modality already available in the surgical workflow, to reconstruct an entire scene from multiple viewpoints online, while simultaneously segmenting one or more organs of interest. Our main contributions are as follows:

  • Mosaicking of frame reconstruction parts using a frame-to-model registration with the possible use of a tracking device (e.g., NDI Polaris).
  • Dense surface model that is generated online and is available after each image frame.
  • Per-frame segmentation of organs is achieved through a fast graphics processing unit (GPU) random forest approach.
  • Global segmentation allows accumulation of the single-frame segmentation probabilities for each global surface point. The combined segmentation results lead to a higher and more robust recognition rate.

In the following, we will present a more detailed description of our reconstruction workflow, followed by an evaluation using in silico, phantom, ex vivo, and in vivo data (porcine). Three methods for determining the camera pose are also evaluated: optical tracking, iterative closest point (ICP) tracking, and a combination of these two methods. The evaluation and workflow are described in the context of laparoscopic liver surgery.

Our system for reconstructing the scene consists of multiple steps (Fig. 1). First, we reconstruct a 3-D point cloud from stereo image frames. At the same time, the organs of interest are segmented in the video image. Afterward, the reconstruction is combined with the segmentation results and integrated into a truncated signed distance (TSD) volume. From this volume, a mosaicked model of the combined reconstructions can be retrieved. Using a TSD volume allows us to incorporate information from different viewpoints to create a larger model than from a single view, while simultaneously reducing noise in the model.

Reconstruction and Segmentation

The stereo endoscope provides left and right camera images, which are first preprocessed to remove distortion and to rectify the image pair. Using correspondence analysis,17 we first calculate a disparity map between the two images and then triangulate those matches, resulting in a dense 3-D point cloud Ri in camera coordinates for each time step. The preprocessing and the correspondence analysis were both implemented on the GPU.
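The triangulation step at the end of this pipeline can be illustrated with a minimal sketch. It assumes an ideal rectified pinhole stereo pair; the function name and the parameters `focal_px`, `baseline_mm`, `cx`, and `cy` are hypothetical and not taken from the HRM implementation, which also covers the correspondence search itself and runs on the GPU.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]


def triangulate_disparity(
    disparity: List[List[float]],
    focal_px: float,
    baseline_mm: float,
    cx: float,
    cy: float,
) -> List[Optional[Point]]:
    """Back-project a rectified disparity map into camera coordinates.

    Returns one entry per pixel in row-major order; pixels without a
    valid (positive) disparity yield None.
    """
    points: List[Optional[Point]] = []
    for v, row in enumerate(disparity):
        for u, d in enumerate(row):
            if d <= 0.0:
                points.append(None)  # no stereo match for this pixel
                continue
            z = focal_px * baseline_mm / d  # depth along the optical axis
            x = (u - cx) * z / focal_px
            y = (v - cy) * z / focal_px
            points.append((x, y, z))
    return points
```

Because depth is inversely proportional to disparity, a fixed matching error in pixels translates into a larger depth error for distant surfaces.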

Every pixel in the scene is simultaneously classified using a random forest19 into foreground, e.g., liver, and background. As features, the hue and saturation channels from the HSV color space and the color-opponent dimensions a and b from the LAB color space were used. The classifier thus provides a mapping $C_i(p) \in \{1, \dots, n\}$, $p \in R_i$, from a 3-D point to a class label for each time step.

The random forest was trained on multiple previously labeled images. We trained a forest consisting of 50 trees with a maximum depth of 10. To allow real-time processing, the classification portion of the random forest was ported to the GPU.
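As an illustration of the four-dimensional feature vector described above, the following sketch computes hue, saturation, a, and b for a single pixel. It assumes sRGB input in [0, 1] and a D65 white point; the function names are hypothetical, and the random forest itself is not shown.

```python
import colorsys
from typing import Tuple


def _srgb_to_linear(c: float) -> float:
    """Undo the sRGB gamma curve for one channel in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _lab_f(t: float) -> float:
    """Nonlinearity used in the CIE XYZ -> Lab conversion."""
    delta = 6.0 / 29.0
    if t > delta ** 3:
        return t ** (1.0 / 3.0)
    return t / (3.0 * delta ** 2) + 4.0 / 29.0


def pixel_features(r: float, g: float, b: float) -> Tuple[float, float, float, float]:
    """Return (hue, saturation, a, b_lab) for one sRGB pixel."""
    hue, sat, _value = colorsys.rgb_to_hsv(r, g, b)
    rl, gl, bl = (_srgb_to_linear(c) for c in (r, g, b))
    # linear sRGB -> CIE XYZ (D65 primaries)
    x = 0.4124 * rl + 0.3576 * gl + 0.1805 * bl
    y = 0.2126 * rl + 0.7152 * gl + 0.0722 * bl
    z = 0.0193 * rl + 0.1192 * gl + 0.9505 * bl
    # normalise by the white point (row sums of the matrix above)
    fx, fy, fz = _lab_f(x / 0.9505), _lab_f(y / 1.0), _lab_f(z / 1.0890)
    a = 500.0 * (fx - fy)      # green (-) to red (+) opponent axis
    b_lab = 200.0 * (fy - fz)  # blue (-) to yellow (+) opponent axis
    return hue, sat, a, b_lab
```

All four features discard the brightness channel (V in HSV, L in LAB), which plausibly makes the classifier less sensitive to the strongly varying illumination of endoscopic images.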

Integration into Truncated Signed Distance-Volume

Assuming the pose $P_i$ of the camera in each time step is known, the point clouds $R_i$ can be transformed into the world coordinate system, $R_i^W = P_i(R_i)$. At every time step, $R_i^W$ is integrated into a TSD volume $S_i(p) \mapsto [F_i(p), K_i(p,j), W_i(p)]$, where $p$ is a voxel in the volume. The TSD value $F_i(p)$ and the weight $W_i(p)$ are computed as suggested in Ref. 14:

$$F_i(p) = \frac{W_{i-1}(p)\,F_{i-1}(p) + W_{R_i}(p)\,F_{R_i}(p)}{W_{i-1}(p) + W_{R_i}(p)}, \tag{1}$$

$$W_i(p) = W_{i-1}(p) + W_{R_i}(p), \tag{2}$$

where $W_{R_i}(p)$ is the weight of voxel $p$ in the current frame. It can be used to weight the TSD value computed for the current frame, $F_{R_i}$, in correlation with the measurement uncertainty, or be set uniformly to one. $F_{R_i}$ can be computed as

$$F_{R_i}(p) = \Psi\left[\lambda^{-1}\,\|t_i - p\|_2 - R_i(x)\right], \tag{3}$$

$$\lambda = \|K^{-1}\dot{x}\|_2, \tag{4}$$

$$x = \lfloor K\,T_i^{-1}\,p \rfloor, \tag{5}$$

$$\Psi(\eta) = \begin{cases} \min\left(1, \dfrac{\eta}{\mu}\right)\operatorname{sgn}(\eta), & \eta \ge -\mu \\ \text{undefined}, & \text{else}, \end{cases} \tag{6}$$

where $K$ is the camera calibration matrix, $\dot{x}$ is the homogenized image coordinate $x$, $\lfloor \cdot \rfloor$ is the nearest neighbor lookup, $T_i$ is the camera transformation, and $t_i$ is the translation part of $T_i$. $\lambda^{-1}$ converts the ray distance $\|t_i - p\|_2$ to a depth value in the camera coordinate system. The function $\Psi(\eta)$ specifies the area of influence of $R_i$ over the voxels $F_{R_i}$. The parameter $\mu$ is responsible for the maximal distance before the influence of a point on a voxel is truncated.
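The per-voxel update of Eqs. (1) and (2) is a weighted running average, which the following sketch illustrates for a single voxel. The `Voxel` class and both function names are hypothetical. The truncation is a simplified stand-in for Ψ and an assumption of this example: it clamps η/μ to at most 1 and skips the update when η < −μ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Voxel:
    tsd: float = 0.0     # F_i(p), running truncated signed distance
    weight: float = 0.0  # W_i(p), accumulated weight


def truncate(eta: float, mu: float) -> Optional[float]:
    """Simplified stand-in for Psi (an assumption of this sketch)."""
    if eta < -mu:
        return None  # outside the area of influence: voxel stays untouched
    return min(1.0, eta / mu)


def integrate(voxel: Voxel, eta: float, mu: float, w_frame: float = 1.0) -> None:
    """Fold one frame's measurement for this voxel into the running average."""
    f_frame = truncate(eta, mu)
    if f_frame is None:
        return
    total = voxel.weight + w_frame
    voxel.tsd = (voxel.weight * voxel.tsd + w_frame * f_frame) / total  # Eq. (1)
    voxel.weight = total                                                # Eq. (2)
```

With uniform weights the stored value is simply the mean of all truncated measurements seen so far, which is why integrating several noisy frames yields a smoother surface than any single frame.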

We included $K_i(p,j)$ in the volume to account for class membership of $p$:

$$K_i(p,j) = \frac{W_{i-1}(p)\,K_{i-1}(p,j) + W_{R_i^W}(p)\,K_{R_i^W}(p,j)}{W_{i-1}(p) + W_{R_i^W}(p)}, \tag{7}$$

$$K_{R_i^W}(p,j) = \begin{cases} 1, & \text{if } C_i[R_i^W(p)] = j \\ 0, & \text{else}, \end{cases} \tag{8}$$

where $R_i^W(p)$ represents the point in $R_i^W$ that lies in $p$ and $j$ stands for the classifier category (e.g., background and target structure).

The class membership $C_i(p)$ at the current time step can then be computed as

$$C_i(p) = \arg\max_{j \in \{1, \dots, n\}} K_i(p,j). \tag{9}$$

This way of smoothing class membership over time allows our system to cope with potential misclassifications.
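A small sketch shows how this temporal smoothing absorbs a single misclassified frame. The function names are hypothetical, class indices start at 0 rather than 1, and the frame weight is assumed to be uniformly one.

```python
from typing import List


def update_class_scores(
    scores: List[float], weight: float, label: int, w_frame: float = 1.0
) -> List[float]:
    """Eqs. (7) and (8): blend a one-hot vote for `label` into the running scores."""
    total = weight + w_frame
    return [
        (weight * k_prev + w_frame * (1.0 if j == label else 0.0)) / total
        for j, k_prev in enumerate(scores)
    ]


def voxel_label(scores: List[float]) -> int:
    """Eq. (9): class with the largest accumulated score."""
    return max(range(len(scores)), key=scores.__getitem__)


# Four frames see the same voxel; the third one is misclassified as background (0).
scores, weight = [0.0, 0.0], 0.0
for frame_label in [1, 1, 0, 1]:
    scores = update_class_scores(scores, weight, frame_label)
    weight += 1.0
print(scores, voxel_label(scores))
```

After the four frames the scores are 0.25 for background and 0.75 for the organ, so the voxel keeps the organ label despite the outlier.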

Camera Pose

To integrate the point cloud $R_i$ into the TSD volume, the pose $P_i$ of the camera at time step $i$ has to be known. In this paper, we consider three methods for estimating $P_i$.

  1. ICP: We adopt the assumption of Newcombe et al.14 that the pose of the camera changes only slightly between frames. By registering Ri with a ray cast of the TSD volume using the projective data association ICP algorithm,20 we estimate Pi. With the small movement assumption and the special ICP variant, all pixels can be used in real time.
  2. Polaris: We use the NDI Polaris optical tracking system to track both camera and the patient.
  3. Mixed: We combine the two methods by using the tracking information as a seed for the ICP.
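The difference between the three modes lies mainly in where the pose estimate starts. The following sketch makes that explicit; it is only an illustration, and `estimate_pose`, the mode strings, and the `icp_refine` callback are hypothetical names. The callback stands for a frame-to-model ICP that starts from a given initial pose and returns the refined pose.

```python
from typing import Callable, List, Optional

Pose = List[List[float]]  # 4x4 homogeneous transform, row-major


def estimate_pose(
    mode: str,
    previous_pose: Pose,
    tracked_pose: Optional[Pose],
    icp_refine: Callable[[Pose], Pose],
) -> Pose:
    """Select the camera pose P_i according to the chosen mode."""
    if mode == "icp":
        # small-motion assumption: start the ICP from the last known pose
        return icp_refine(previous_pose)
    if mode == "polaris":
        if tracked_pose is None:
            raise ValueError("polaris mode needs a tracked pose")
        return tracked_pose  # trust the optical tracker as is
    if mode == "mixed":
        if tracked_pose is None:
            raise ValueError("mixed mode needs a tracked pose")
        return icp_refine(tracked_pose)  # tracking seeds the ICP
    raise ValueError(f"unknown mode: {mode}")
```

In mixed mode a failed ICP in one frame does not propagate, since the next frame is seeded from the tracker again rather than from the previous ICP result.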

We performed five experiments to evaluate our system using in silico, phantom, ex vivo, and in vivo livers. For each liver, a reference was computed by laser scan or CT. In each experiment, we moved the stereo endoscope over the liver and used the captured images to reconstruct and segment the liver simultaneously. For each experiment, three mosaicked models, each with a different method for tracking the camera pose, were constructed as described in Sec. 2.3. Afterward, we computed the average distance of each intraoperative reconstructed point to the reference for each model. To reduce the influence of tracking errors, the mosaicked porcine liver models were registered to the reference using ICP. For the purpose of comparison, we also computed the average distance of the unprocessed single frame point clouds RiW to the ground truth.
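The distance measure used in the evaluation can be sketched as follows. The example assumes the reference surface is available as a point set and uses a brute-force nearest-neighbor search; the function names are hypothetical, and a real evaluation would more likely use point-to-surface distances and a spatial index.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def closest_distance(p: Point, reference: Sequence[Point]) -> float:
    """Euclidean distance from p to its nearest reference point."""
    return min(math.dist(p, q) for q in reference)


def mean_and_rms_error(
    points: Sequence[Point], reference: Sequence[Point]
) -> Tuple[float, float]:
    """Mean and root mean square of the closest-point distances."""
    distances = [closest_distance(p, reference) for p in points]
    mean = sum(distances) / len(distances)
    rms = math.sqrt(sum(d * d for d in distances) / len(distances))
    return mean, rms
```

The RMS value weights large deviations more strongly than the mean, so a few badly placed patches raise it noticeably.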

The camera pose used for transforming each point cloud into the world coordinate system was given by an NDI Polaris optical tracking system. For the two silicone and the first ex vivo experiment, a calibrated phase alternating line (PAL) stereo endoscope with a fixed camera unit and a PC workstation (Table 1, No. 1) were used. The second ex vivo and the in vivo experiment were conducted with a calibrated HD stereo endoscope with chip-on-the-tip technology (Table 1, No. 2).

Table 1. The PC and endoscopic hardware used for evaluation. Both stereo endoscopes were calibrated before the experiments and have no zoom and a fixed focus. The run-time analysis reveals that the higher computational cost caused by the higher resolution can be compensated by faster hardware.

Both configurations took, on average, 0.25 s for one frame integration, implying a frame rate of 4 fps. More run-time information is available in Table 1.

In Silico

In order to evaluate the mosaicking without the errors induced by the stereo matching (HRM), we used a simulation framework to generate a circular image sequence of a textured CT-liver model (Fig. 2). For each of the 320 images, depth map and camera position were computed. With the simulated input data, an accurate mosaicked reconstruction of the model was achieved (Table 2).

Fig. 2. (a) Textured simulation model, and (b) error distribution using reference depth data and (c) using hybrid recursive matching (HRM).

Table 2. The root mean square (RMS) error between the mosaicked models and the reference model in mm. The last column contains the RMS error, using all frames for evaluation separately. In the last row, hybrid recursive matching (HRM) is used for depth map creation instead of the reference depth data.

The simulation was also used to create noisy depth data to evaluate the mosaicking behavior on imperfect data. The noise was generated with a Perlin noise model, as it is similar to the errors made by HRM. Three different noise levels, noise 1 (mean error 1.12 ± 0.86 mm), noise 2 (2.30 ± 1.68 mm), and noise 3 (3.36 ± 2.49 mm), were used. The results show that mosaicking reduces the noise and produces a more accurate model than the single-shot reconstructions (Fig. 3).

Fig. 3. Error distribution for (a) noise 1, (b) noise 2, and (c) noise 3.

Phantom Liver

After verifying the method in silico, we performed five phantom experiments with three silicone livers (Fig. 4). The first two livers were recorded with both the Wolf and HD stereo endoscope, and the third only with the HD stereo endoscope (Fig. 5). The first (Wolf endoscope 1 and HD endoscope 1) and third (HD endoscope 3) liver were placed on a flat surface, whereas the second liver (Wolf endoscope 2 and HD endoscope 2) was placed inside a 3-D printed patient phantom (Fig. 4). As previously mentioned, an NDI Polaris optical tracking system was used for endoscope position tracking. To evaluate the ICP only approach, Polaris tracking data from the first image frame served as registration to the reference model.

Fig. 4. (a) Experimental phantom setup: stereo endoscope, optical tracking system, and patient phantom. (b) Silicone liver 1, (c) silicone liver 2, and (d) silicone liver 3.

Fig. 5. Error distribution (HD stereo endoscope) in (a) liver 1, (b) liver 2, and (c) liver 3. The images for the Wolf stereo endoscope can be viewed in our previous work.16

The results show that the use of an HD stereo endoscope increases the quality and stability of the method (Table 3). In combination with the Wolf stereo endoscope, our method produces the best results in Polaris mode. With the HD stereo endoscope, the best results shift toward mixed mode. Figure 6 illustrates an example of a failed reconstruction using the ICP for frame-to-model registration. Multiple consecutive frame-to-model registrations with high errors in position or orientation usually lead to a fracture in the final reconstruction, i.e., the spatial relation of the reconstructed parts before and after the ICP failure(s) is erroneous.

Table 3. The RMS error between the mosaicked models and the reference models in mm. The use of the HD stereo endoscope caused an overall improvement of the results. The first and second recording (lines 1 and 2) illustrate the strong influence of the HRM quality in the final reconstruction. The error levels in single shot (single HRM reconstructions) are reflected in the results of the final reconstruction.

Fig. 6. Example of an iterative closest point (ICP) failure in silicone liver 1. A reconstruction using only Polaris is shown in (a). The hole in the model is due to missing viewpoints. A reconstruction using ICP is shown in (b), where the mosaicking failed, creating a larger hole. The mixed approach was used in (c), which closed the hole, connecting parts of the liver that do not belong together.

To determine whether the models created by our approach are suitable for registering a preoperative model in the absence of soft tissue deformation, we transformed the model for silicone liver 1 using multiple random rigid transformations. We then performed a rough registration of the model to the reference laser scan with fast point feature histograms21 and fine-tuned it using ICP. The average distance error for 600 random transformations was 13.19 ± 23.39 mm, with 90% having an error <10 mm. In comparison, using only a single frame reconstruction resulted in an error of 89.92 ± 20.48 mm.

Ex Vivo Porcine Liver

As a first step into the real operating environment, two ex vivo porcine liver experiments were conducted. In the first experiment (liver 1), the Wolf stereo endoscope was used, and reference data were provided by a laser scan. For the second experiment (liver 2), we used the high-resolution HD Storz stereo endoscope and CT imaging as reference data (Fig. 7).

Fig. 7. Error distribution in ex vivo liver experiments: (a) ex vivo with Wolf stereo endoscope. Blue signals an error <2 mm and red an error >2 mm. (b) Ex vivo with HD stereo endoscope.

The results from liver 1 are comparable to the phantom data, both using the same hardware, showing that the HRM can cope with real liver texture. The second experiment using the HD Storz stereo endoscope reduced the root mean square error from 4.21 to 1.51 mm (Table 4). While slightly different experiment settings could cause small differences, the large improvement is most likely due to the better image quality and resolution.

Table 4. The RMS error between the mosaicked models and the ground truth models in mm. The HD stereo endoscope outperforms the older Wolf stereo endoscope clearly. This demonstrates the importance of image quality and image resolution for the reconstruction result.

In Vivo Porcine Liver

To evaluate our system in an in vivo setting, we performed an animal experiment. At first, the pig was prepared for surgery and placed on the CT table (Fig. 8). After applying a pneumoperitoneum as well as placing ports for the endoscope and instruments, we recorded several image sequences featuring a sweep of the porcine liver. Shortly after each sequence, a CT scan was taken in order to evaluate the sequence, using the liver model acquired through the scan. To minimize breathing deformation between the two image modalities, respiration was paused between scan acquisitions.

Fig. 8. Experimental setting for in vivo evaluation. The image sequences were taken directly on the CT table to minimize recording time between endoscopy and CT images. The last picture displays an exemplary in vivo endoscopic image.

The previous results in the second ex vivo experiment agree with the in vivo results. Both were obtained using the HD Storz stereo endoscope (Fig. 9). As in the previous ex vivo experiment, the error in all three sequences was smallest in mixed mode (Table 5). The mean error of the three mixed mode results is 0.86 mm.

Fig. 9. Error distribution for in vivo liver experiment: (a) Polaris, (b) ICP, and (c) mixed.

Table 5. The RMS error between the mosaicked models and the ground truth models in mm. A reference CT scan was performed before and after each sequence. The sequences differ in endoscope movement and liver coverage. The different error values between sequences indicate that the endoscope handling plays a significant role for the reconstruction quality.

In this paper, we presented an approach enabling the reconstruction and segmentation of organs from multiple viewpoints online during laparoscopic surgery. We have clearly demonstrated that mosaicking multiple reconstructions reduces the distance error when compared to single-shot reconstructions. Furthermore, we have shown that using a mosaicked model for rigid registration produces a significantly smaller error (dropping from 90 to 13 mm).

The comparison between results from the Wolf stereo endoscope and the HD stereo endoscope allows an insight into the correlation of image quality and final reconstruction result. The data suggest that image quality and image resolution are important for two steps. First, the HRM reconstruction needs a certain image quality to produce satisfactory results, e.g., good illumination, resolution, and little distortion. For the HRM, on the other hand, the increased sensor noise greatly reduces the reconstruction quality (as shown in Fig. 10). Second, for the ICP-based methods (ICP only and mixed), a bad frame reconstruction affects not only the mosaicked model directly, but also the frame-to-model registration, as the ICP uses the frame 3-D reconstruction to register the frame to the model created so far. Without the use of Polaris tracking, multiple consecutive bad frame registrations usually lead to a complete failure of the mosaicking attempt. The Polaris localization method allows a higher HRM error tolerance since the patches are at least placed at the correct location.

Fig. 10. Difficult situations: (a) our method would integrate the instruments into the volume destroying previously captured surface information near it. A poorly illuminated image is shown in (b) leading to (c) a poor HRM result.

In our experiments, Polaris tracking was necessary to achieve the best results. However, advances in hardware, like HD stereo endoscopes, will make image-based tracking more robust. As shown in our ex vivo experiments, the ICP-only error dropped 78% due to the use of the better HD endoscope. Also, the mixed mode exceeds the pure Polaris method when used with the HD endoscope, meaning that the small localization errors were reduced by the ICP. This is a synergetic process, as Polaris provides the good initial alignment needed for a stable ICP.

Our work has limitations. Objects, like instruments, moving between camera and organ lead to reconstruction errors. Although the instruments are likely classified as background, they are still integrated into the voxel volume. This causes an erroneous morphing of the underlying previously captured organ surface. To fix this problem, the instruments have to be specifically classified in the image and the associated pixels then excluded from the integration process. We are currently working on a stable automatic classification of instruments to solve this problem. A general problem is the HRM reconstruction quality. Slight deviations from suitable illumination settings can lead to bad reconstruction results, as shown in Fig. 10. Therefore, careful monitoring of the capture settings is needed. Since our method relies on surface sweeps, sufficient space for endoscopic movement is required; otherwise, not enough surface area is captured for reconstruction. Finally, the frame-to-model ICP registration modes (mixed and ICP only) are likely not suitable for organs with uniform appearances (e.g., prostate) or would at least produce a higher error than with distinctly shaped organs (e.g., liver or kidney).

Future research will focus on accounting for dynamic scenes, as currently only static scenes were considered, meaning that soft tissue deformation was not taken into account. Due to the shown limitations of the frame-to-model ICP registration, other methods for localization should be evaluated to lessen the dependency on optical tracking systems. In particular, feature-based approaches, which take advantage of the veined surface of organs and of color information in general, are a promising addition to methods that use depth data only.

Special thanks to Rudolf Rempel and Patrick Mietkowski for the great clinical support. The present research is sponsored by the Klaus Tschira Foundation and the European Social Fund of the State Baden-Wuerttemberg. It was conducted within the setting of the research training group 1126 and the SFB/Transregio 125, Project A01 funded by the German Research Foundation. The study protocol was approved by the local ethics committee Heidelberg and by the regional committee Karlsruhe. Veterinary handling and care were provided by the staff of the Interfacultary Biomedical Research Facility at Heidelberg University. All animals were treated in compliance with the National Research Council’s criteria for humane care as described in the Guide for the Care and Use of Laboratory Animals prepared by the National Institutes of Health.

1. Herline A. J. et al., “Surface registration for use in interactive, image-guided liver surgery,” Comput. Aided Surg. 5(1), 11–17 (2000).
2. Clements L. W. et al., “Robust surface registration using salient anatomical features for image-guided liver surgery: algorithm and validation,” Med. Phys. 35(6), 2528–2540 (2008).
3. Dumpuri P. et al., “Model-updated image-guided liver surgery: preliminary results using surface characterization,” Prog. Biophys. Mol. Biol. 103(2), 197–207 (2010).
4. Rucker D. C. et al., “A mechanics-based nonrigid registration method for liver surgery using sparse intraoperative data,” IEEE Trans. Med. Imaging 33(1), 147–158 (2014).
5. Maier-Hein L. et al., “Optical techniques for 3-D surface reconstruction in computer-assisted laparoscopic surgery,” Med. Image Anal. 17(8), 974–996 (2013).
6. Plantefève R. et al., “Patient-specific biomechanical modeling for guidance during minimally-invasive hepatic surgery,” Ann. Biomed. Eng. 1–15 (2015).
7. Mountney P. et al., “Simultaneous stereoscope localization and soft-tissue mapping for minimal invasive surgery,” in Medical Image Computing and Computer-Assisted Intervention, pp. 347–354, Springer (2006).
8. Mountney P. and Yang G.-Z., “Motion compensated SLAM for image guided surgery,” in Medical Image Computing and Computer-Assisted Intervention, pp. 496–504, Springer (2010).
9. Puerto-Souza G. et al., “A fast and accurate feature-matching algorithm for minimally-invasive endoscopic images,” IEEE Trans. Med. Imaging 32(7), 1201–1214 (2013).
10. Puerto-Souza G. A., Castaño-Bardawil A., and Mariottini G.-L., “Real-time feature matching for the accurate recovery of augmented-reality display in laparoscopic videos,” in Augmented Environments for Computer-Assisted Interventions, Linte C. A. et al., Eds., pp. 153–166, Springer, Berlin, Heidelberg (2013).
11. Grasa O. G., Civera J., and Montiel J., “EKF monocular SLAM with relocalization for laparoscopic sequences,” in 2011 IEEE Int. Conf. on Robotics and Automation, pp. 4816–4821, IEEE (2011).
12. Grasa O. G. et al., “Visual SLAM for handheld monocular endoscope,” IEEE Trans. Med. Imaging 33(1), 135–146 (2014).
13. Scaramuzza D. et al., “Absolute scale in structure from motion from a single vehicle mounted camera by exploiting nonholonomic constraints,” in 2009 IEEE 12th Int. Conf. on Computer Vision, pp. 1413–1419, IEEE (2009).
14. Newcombe R. A. et al., “KinectFusion: real-time dense surface mapping and tracking,” in Proc. of the 2011 10th IEEE Int. Symp. on Mixed and Augmented Reality, pp. 127–136, IEEE Computer Society, Washington, DC (2011).
15. Haase S. et al., “3-D operation situs reconstruction with time-of-flight satellite cameras using photogeometric data fusion,” in Medical Image Computing and Computer-Assisted Intervention, Mori K. et al., Eds., pp. 356–363, Springer, Berlin Heidelberg (2013).
16. Bodenstedt S. et al., “Intraoperative on-the-fly organ-mosaicking for laparoscopic surgery,” Proc. SPIE 9415, 94151S (2015).
17. Röhl S. et al., “Dense GPU-enhanced surface reconstruction from stereo endoscopic images for intraoperative registration,” Med. Phys. 39, 1632 (2012).
18. Maier-Hein L. et al., “Comparative validation of single-shot optical techniques for laparoscopic 3-D surface reconstruction,” IEEE Trans. Med. Imaging 33(10), 1913–1930 (2014).
19. Schroff F., Criminisi A., and Zisserman A., “Object class segmentation using random forests,” 2008, http://research.microsoft.com/pubs/72423/Criminisi_bmvc2008.pdf (13 November 2015).
20. Blais G. and Levine M. D., “Registering multiview range data to create 3-D computer objects,” IEEE Trans. Pattern Anal. Mach. Intell. 17(8), 820–824 (1995).
21. Rusu R. B., Blodow N., and Beetz M., “Fast point feature histograms (FPFH) for 3-D registration,” in IEEE Int. Conf. on Robotics and Automation, pp. 3212–3217, IEEE (2009).

Daniel Reichard is pursuing a doctoral degree in computer science at the Institute for Anthropomatics and Robotics, KIT. He is performing research within the computer-assisted surgery junior research group. His research interests are in endoscopic three-dimensional (3-D) reconstruction and 3-D model registration for preoperative data.

Stefanie Speidel leads a computer-assisted surgery junior research group at the Institute for Anthropomatics and Robotics, KIT. She received her PhD in 2009 in the context of the intelligent surgery research training group, a cooperation among KIT, the University of Heidelberg, and the German Cancer Research Center. Her research interests include endoscopic vision, intraoperative sensor analysis for context-aware assistance, and intraoperative registration with biomechanical models.

Rüdiger Dillmann received his PhD from the University of Karlsruhe in 1980. Since 1987, he has been a professor in the Department of Computer Science and is director of the Humanoids and Intelligence Systems Research Lab at the Karlsruhe Institute of Technology (KIT). In 2002, he became director of an innovation lab at the Research Center for Information Science (FZI). Since 2009, he has been spokesman of the Institute of Anthropomatics and Robotics at KIT.

Biographies for the other authors are not available.

© The Authors. Published by SPIE under a Creative Commons Attribution 3.0 Unported License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.

Citation

Daniel Reichard; Sebastian Bodenstedt; Stefan Suwelack; Benjamin Mayer; Anas Preukschas, et al., "Intraoperative on-the-fly organ-mosaicking for laparoscopic surgery," J. Med. Imag. 2(4), 045001 (Dec 10, 2015); http://dx.doi.org/10.1117/1.JMI.2.4.045001


Figures

Fig. 2. (a) Textured simulation model, (b) error distribution using reference depth data, and (c) error distribution using hybrid recursive matching (HRM).

Fig. 3. Error distribution for (a) noise 1, (b) noise 2, and (c) noise 3.

Fig. 4. (a) Experimental phantom setup: stereo endoscope, optical tracking system, and patient phantom. (b) Silicone liver 1, (c) silicone liver 2, and (d) silicone liver 3.

Fig. 5. Error distribution (HD stereo endoscope) in (a) liver 1, (b) liver 2, and (c) liver 3. The images for the Wolf stereo endoscope can be viewed in our previous work (Ref. 16).

Fig. 6. Example of an iterative closest point (ICP) failure in silicone liver 1. A reconstruction using only Polaris is shown in (a). The hole in the model is due to missing viewpoints. A reconstruction using ICP is shown in (b), where the mosaicking failed, creating a larger hole. The mixed approach was used in (c), which closed the hole, connecting parts of the liver that do not belong together.

Fig. 7. Error distribution in ex vivo liver experiments: (a) ex vivo with Wolf stereo endoscope and (b) ex vivo with HD stereo endoscope. Blue indicates an error <2 mm and red an error >2 mm.

Fig. 8. Experimental setting for in vivo evaluation. The image sequences were taken directly on the CT table to minimize the time between the endoscopy and CT recordings. The last picture displays an exemplary in vivo endoscopic image.

Fig. 9. Error distribution for in vivo liver experiment: (a) Polaris, (b) ICP, and (c) mixed.

Fig. 10. Difficult situations: (a) our method would integrate the instruments into the volume, destroying previously captured surface information near them. A poorly illuminated image is shown in (b), leading to (c) a poor HRM result.

Tables

Table 1. The PC and endoscopic hardware used for evaluation. Both stereo endoscopes were calibrated before the experiments and have no zoom and a fixed focus. The run-time analysis reveals that the higher computational cost caused by the higher resolution can be compensated for by faster hardware.

Table 2. The root mean square (RMS) error between the mosaicked models and the reference model in mm. The last column contains the RMS error obtained when each frame is evaluated separately. In the last row, hybrid recursive matching (HRM) is used for depth map creation instead of the reference depth data.

Table 3. The RMS error between the mosaicked models and the reference models in mm. The use of the HD stereo endoscope caused an overall improvement of the results. The first and second recordings (rows 1 and 2) illustrate the strong influence of the HRM quality on the final reconstruction. The error levels in single shot (single HRM reconstructions) are reflected in the results of the final reconstruction.

Table 4. The RMS error between the mosaicked models and the ground truth models in mm. The HD stereo endoscope clearly outperforms the older Wolf stereo endoscope. This demonstrates the importance of image quality and image resolution for the reconstruction result.

Table 5. The RMS error between the mosaicked models and the ground truth models in mm. A reference CT scan was performed before and after each sequence. The sequences differ in endoscope movement and liver coverage. The different error values between sequences indicate that the endoscope handling plays a significant role in the reconstruction quality.
