Physics of Medical Imaging

Optical tomography with discretized path integral

[+] Author Affiliations
Bingzhi Yuan, Toru Tamaki, Takahiro Kushida, Bisser Raytchev, Kazufumi Kaneda

Hiroshima University, Department of Information Science, Graduate School of Engineering, 1-4-1 Kagamiyama, Higashi-Hiroshima 739-8527, Japan

Yasuhiro Mukaigawa, Hiroyuki Kubo

Graduate School of Information Science, NAIST, 8916-5, Takayama, Ikoma, Nara 630-0192, Japan

J. Med. Imag. 2(3), 033501 (Aug 13, 2015). doi:10.1117/1.JMI.2.3.033501
History: Received January 16, 2015; Accepted July 10, 2015
Text Size: A A A

Open Access Open Access

Abstract.  We present a framework for optical tomography based on a path integral. Instead of directly solving the radiative transport equations, which have been widely used in optical tomography, we use a path integral that has been developed for rendering participating media based on the volume rendering equation in computer graphics. For a discretized two-dimensional layered grid, we develop an algorithm to estimate the extinction coefficients of each voxel with an interior point method. Numerical simulation results are shown to demonstrate that the proposed method works well.

Figures in this Article

Optical tomography18 is known as a safer alternative to x-ray tomography. Usually, tomography consists of a light source generating penetrative light and a detector capturing the light, which allows to estimate the inside of the object through which the light is passing. The most important application is x-ray computed tomography (CT), where x rays are used due to their penetrative property. The balance between the radiation exposure of the human body and the quality of the obtained results has been debated since the early days when x-ray CT was invented. Therefore, there is an urgent demand for a safer medical tomography, such as optical tomography.

Modeling the behavior of light plays an important role in optical tomography, and in the mesoscale, in which the wavelength of light is close to the scale of tissue, the radiative transport equation (RTE) is used for describing the behavior of light scattering.5,9 At the macroscale,6 the time-independent or dependent RTE is often approximated with a diffusion equation.

Similarly, the computer graphics community has the used time-independent RTE, and in contrast to the (surface) rendering equation,10,11 often calls it the volume rendering equation (VRE).10,12Display Formula

(ω·)L(x,ω)=σt(x)L(x,ω)+σs(x)S2fp(x,ω,ω)L(x,ω)dω,(1)
and the notations will be introduced in the following sections. The use of VRE enables us to render volumes of participating media, such as fog, cloud, and fire through which light is penetrating, and to obtain realistic volume rendering images of such scenes.13,14 The path integral, which can be considered as a discrete version of the continuous Feynman path integral,15,16 has been recently employed to solve the VRE in an efficient way with Monte Carlo integration, such as Metropolis light transport17,18 or bidirectional path tracing.19

In this paper, we propose an optical tomography method using path integral as a forward model and solving a nonlinear inverse problem that minimizes the discrepancy between measurements and model predictions in a least-squares sense. To the best of our knowledge, the discretized path integral has not been used in optical tomography before. In our work, we simplify the path integral with some assumptions. The path integral, as the name suggests, gathers (or integrates) the contributions of all possible paths of light.17,18,2023 We approximate the integral of an infinite number of paths with the sum of a finite number of paths, discretize a continuous medium into voxels of a regular grid, and continuous light paths into discrete ones (i.e., polylines). We deal with anisotropic scattering having a peak in the forward direction, which is different from other discretization methods using discrete ordinate or spherical harmonics.13,24,25 In this work, we focus on estimating the spatially varying extinction coefficient σt(x) at each discretized voxel location of the medium while fixing scattering properties (e.g., scattering coefficients σs and phase functions fp). By separating the scattering properties from our problem, we formulate optical tomography as an optimization problem with inequality constraints solved by an interior point method.

An interior point method26 is an iterative method to solve an optimization problem with inequality constraints describing a feasible region in which the optimal solution must reside. To this end, a series of nonconstrained optimization problems are constructed by combining the constraints and the original objective function and are solved by an ordinal gradient-based (Quasi-Newton) method.

To summarize our contribution, we reformulate the problem of optical tomography by combining a path integral with several simplifying assumptions to model the light transport in the participating media. This paper is an extension of our previous conference version27,28 with additional theoretical background and additional experiments and discussions, and is structured as follows. In Sec. 2, we briefly review previous work related to path integrals and optical tomography. In Sec. 3, we describe how to model the light transport in participating media and turn optical tomography into an optimization problem. In Sec. 4, we show how to solve the optimization problems. Section 5 reports some simulation results, and Sec. 6 concludes the paper.

In this section, we briefly review related work on optical tomography and path integrals in computer graphics.

Optical tomography4,5 (or inverse transport,6,7 inverse scattering,29 scattering tomography30,31) is a problem in medical imaging using light sources to reconstruct the optical properties of tissue from measurements (time-dependent or stationary, angular-dependent or independent) at the surface boundary. Analytically solving the RTE [Eq. (1)] with boundary conditions is difficult, however, and approximations, such as discrete ordinates and N’th-order spherical harmonics (PN approximation), are often used and solved numerically by, for example, finite element methods (FEM) or finite difference methods. The famous diffuse approximation5,6 (DA) is a P1 (thus first-order) approximation with the assumption on a phase function being isotropic. The DA is an approximation to RTE at a macroscopic scale when scattering is large while absorption is low and scattering is not highly peaked. Diffuse optical tomography (DOT) is based on DA and today represents the frontier of optical tomography32,33 with many clinical applications.34 DA, however, does not often hold in realistic participating (scattering) media; absorption may not be small compared to scattering, and the shapes of the phase functions can be highly peaked in the forward direction, which is often modeled by Henyey-Greenstein,35 Schlick,36 or Mei and Rayleigh phase functions.10,12,37,38 Experimental evidence39 also suggests a highly peaked shape of the phase functions in biological media. DOT works, but is still limited; therefore, other methods have also been studied for cases when DA does not hold.

Statistical Monte Carlo methods are used for media in which the assumptions do not hold; however, they are computationally intensive and inefficient for solving the forward problem,47,34 i.e., solving the RTE with given parameters. Therefore, Monte Carlo based approaches have been used for estimating the spatially constant (not varying) parameters in homogeneous media, such as paper,40,41 clouds,42 liquids,43 plastics,44 or uniform material samples.45 Another difficulty of Monte Carlo based inverse methods is that an analytical forward model prediction is hard to obtain when we want to minimize the difference between the prediction and measurements except for very special structures.46,47 A gradient based least-square approach has been proposed but only for spatially constant parameter estimation,40,41,48 while model-free approaches have relied on genetic algorithms,42,44 numerical perturbation,49,50 voting,51 or even simple backprojection.52 One of the contributions of the current paper is to enable us to use a gradient based optimization approach for estimating spatially varying parameters, which is extensible by using many optimization methods.

Similar to optical tomography, modeling light transport plays a very important role in computer graphics. Our own work on optical tomography is inspired by Monte Carlo based statistical methods. In the last two decades, methods based on path integrals1719,5355 have provided models of light transport for efficient volume rendering. For solving RTE, a path integral has been used for a forward problem solver,16,56,57 and has also been applied to optical tomography, but under the diffusion assumption.58,59 Our proposed method is based on a path integral to explicitly express the forward model prediction, which is very suitable for solving the inverse problem with gradient based methods. This is an advantage of our method over existing methods because the paths used in the forward model can be generated by either a deterministic or statistical (Monte Carlo) method. To achieve an efficient forward model, we introduce a simplified layered scattering model that uses a limited number of deterministic paths instead of Monte Carlo simulated ones.

We deal with the following optical tomography problem [this is a conceptual formulation and the actual problem is shown in Eq. (29)]. Display Formula

minσti,j|IijPij(σt)|2,(2)
where σt is a vector representing the spatial distribution of the extinction coefficients to be estimated. We divide our discussion into two parts: forward and inverse problems. The forward problem focuses on building a mathematical model Pij(σt) of the light transport between a light source i and a detector j. We will make some assumptions on the light transport and the medium to simplify the forward model. An inverse problem minimizes the difference between the observations Iij of the detector and the forward model to estimate the spatial distribution of the extinction coefficients σt.

Forward Model

In the forward problem, as we mentioned before, we use a path integral to build a mathematical model for the light transport. Here, we follow the notation developed in the computer graphics literature17,23,53,60 to introduce the path integral. Sections 3.2 to 3.6 will show the simplified model we propose.

Given a space R3, a light source is located at x0R3 and a detector at xM+1R3, and in between them is the participating media νR3 with boundary ν and interior volume ν0νν. A light path x˜ connecting x0 and xM+1 of length M+2 consists of M+2 vertices xmR3 for m=0,1,,M+1, denoted by x˜=x0,x1,,xM,xM+1. Thus, absorption, scattering, or reflection events happen at x1,,xM. The set of all paths of length M is denoted by ΩM. The path space Ω is the countable set of all paths ΩM of finite length. Display Formula

Ω=M=2ΩM.(3)

A direction is denoted by ωS2, where S2 is a unit sphere in R3. A unit vector ωxm,xm+1 is the direction from vertex xm to vertex xm+1 in a path x˜.

Veach20 introduced a framework representing the rendering equation in the form of a path integral for scenes without participating media (i.e., no scattering), and later, Pauly et al.17 extended it to the volume rendering equation with scattering. The amount of light I observed by the detector is given by the path integral Display Formula

I=Ωf(x˜)dμ(x˜),(4)
which is an integral over the path space. Here, μ(x˜) is a measure of path x˜. Display Formula
dμ(x˜)=m=0M+1dμ(xm),dμ(xm)={dA(xm),xmνdV(xm),xmν0,(5)
where dμ(xm) denotes the differential measure at vertex xm. f(x˜) is a measurement contribution function defined as follows: Display Formula
f(x˜)=Le(x0,x1)G(x0,x1)[m=1Mff(xm1,xm,xm+1)G(xm,xm+1)]We(xM,xM+1),(6)
where We(xM,xM+1) is the camera response function, and Le(x0,x1) is the intensity of the light emitted from the light source x0 to vertex x1. ff(xm1,xm,xm+1) is a scattering kernel at xm with respect to the locations of vertices xm1 and xm+1. Display Formula
ff(xm1,xm,xm+1)={fs(xm1,xm,xm+1),xmνσs(xm)fp(xm1,xm,xm+1),xmν0.(7)

Here, the bidirectional scattering distribution function fs(xm1,xm,xm+1) is used for locations on the surface of objects, and the scattering coefficient σs(xm) at xm and the phase function fp(xm1,xm,xm+1) are used for those inside the medium. G(xm,xm+1) is a generalized geometric term. Display Formula

G(xm,xm+1)=T(xm,xm+1)g(xm,xm+1),(8)
where g(xm,xm+1) is a geometric term. Display Formula
g(xm,xm+1)={|ng(xm)·ωxm,xm+1|xmxm+12,xmν1xmxm+12,xmν0,(9)
with unit normal ng(xm) of the surface at xmν. T(xm,xm+1) is a transmittance that describes the attenuation when light passes through the medium. Display Formula
T(xm,xm+1)={eτ(xm,xm+1),{xm,xm+1}ν0ν0,otherwise,(10)
Display Formula
τ(xm,xm+1)=01σt[(1s)xm+sxm+1]ds,(11)
where σt(xm) is the extinction coefficient at vertex xm.

Putting all together, we have a path integral of the following infinite sum of all possible path contributions. Display Formula

I=M=2kΩMLe(x0,x1)G(x0,x1)[m=1Mff(xm1,xm,xm+1)G(xm,xm+1)]We(xM,xM+1)m=0M+1dμ(xm).(12)

Note that all vertices {xm} depend on a path k; different paths have different sets of vertices. In the equation above, however, we omit the path index k for simplicity. Later, we will again use k as the path index.

Assumptions on the Path Integral Formulation

As our target is optical tomography, we restrict the model to deal with inside the participating media. To do so, we assume that the light source x0 and detector xM+1 are located on the surface, and the other vertices x1,x2,,xM,xM+1 are inside the medium, that is, x0,xM+1ν and x1,,xMν0. Then the transmittance is simplified as Display Formula

T(xm,xm+1)=eτ(xm,xm+1).(13)

Furthermore, we assume that the observations are ideal and the camera response function is the identity, We(xM,xM+1)=1.

Apart from the assumptions above, we rewrite the geometric term and the differential measure. The definitions above use area measures dA(xm) and volume measures dV(xm) along with the squared distance geometric term;17,23,53 however, steradian measures dω(xm) and the identity geometric term is equivalent and also widely used.10,12,60Display Formula

g(xm,xm+1)dμ(xm)=dω(xm).(14)

Therefore, we employ the steradian measures and rewrite it as follows: Display Formula

g(xm,xm+1)=1,(15)
Display Formula
dμk(xm)={dA(x0),m=0dω(xm),m=1,,M+1.(16)

Now, Eq. (12) is written as Display Formula

I=M=2kΩMLe(x0,x1)T(x0,x1)dA(x0)[m=1Mff(xm1,xm,xm+1)T(xm,xm+1)dω(xm)]dω(xM+1).(17)

Discretization of the Forward Model

For numerical computation, we first discretize the medium into voxels of a regular grid, where each voxel has its own extinction coefficient σt[b] (b is the index of the voxel) as shown in Fig. 1.

Graphic Jump Location
Fig. 1
F1 :

Illustration of a discretization example. (a) Voxelization of the medium into a regular grid of size 5×5. Voxels are indexed in raster scan order in this example, from left to right, and top to bottom. Each voxel b has extinction coefficient σt[b]. (b) A path segment between vertices x1 and x2. Voxels involved in the segment are shaded. (c) Lengths d12[b] of the involved voxels b=2,3,8,9. Here we denote d12[b] instead of dx1,x2[b] for simplicity.

With this voxelization, the paths of light are also divided into segments, as explained below. First, we explain the integral [Eq. (11)] along a single segment xm,xm+1 of a path x˜. It describes the attenuation of light along the segment due to the extinction coefficients of the voxels involved. Because of the discretization of the medium, Eq. (11) can be written as a sum of voxel-wise multiplications. Display Formula

τ(xm,xm+1)=01σt[(1s)xm+sxm+1]ds=bBxm,xm+1σt[b]dxm,xm+1[b]=σtTdxm,xm+1.(18)

For the second equality, b is the index of a set Bxm,xm+1 of all voxels involved by segment xm,xm+1, and dxm,xm+1[b] is the length of the part of the segment xmxm+1 passing through voxel b. This is illustrated in Fig. 1(c). The extinction coefficient σt is now a piece-wise constant function because of the voxelization; then the integral turns into a sum (the idea that this integral can be turned into a sum has been discussed before,61 however, not in the context of tomography).

This simplifies the computation; however, the sum over a set Bxm,xm+1 is not preferable in terms of implementation and optimization. We propose here to use a vector representation of both extinction coefficients and segment lengths, which is the third equality of the above equation. The first vector σt stores the values of the extinction coefficients σt[b] of all voxels. This vector can be generated by serializing the voxels on the grid in a certain order. The second vector dxm,xm+1 contains the values of the lengths dxm,xm+1[b] for all voxels. We should note that this vector is very sparse; most of the voxels have no intersection with the segment xm,xm+1. Hence, only a few elements in dxm,xm+1 have nonzero values, and the other elements are zero because those voxels b have no intersection and dxm,xm+1[b]=0.

This sparsity of the vector facilitates the construction of a whole path x˜ because path segments can be added as follows: Display Formula

Dk=m=0Mdxm,xm+1,(19)
where Dk is the vector of a complete path k of length M+2; the b’th element can be interpreted as the length of the segment when the path passes through voxel b. This notation simplifies a part of Eq. (17) as follows: Display Formula
m=0MT(xm,xm+1)=m=0Meτ(xm,xm+1)=em=0Mτ(xm,xm+1)=em=0MσtTdxm,xm+1=eσtTDk.(20)

Using this notation to rewrite Eq. (17), we have Display Formula

I=M=2Le(x0,x1)kΩMHkeσtTDk=Le(x0,x1)kΩHkeσtTDk,(21)
where the factor Hk, defined as Display Formula
Hk=dA(x0)dω(xM+1)m=1Mff(xm1,xm,xm+1)dω(xm),(22)
describes the contributions of the scattering coefficients and phase functions, and the exponential factor represents attenuation due to absorption (and outscattering) over the path.

Two-Dimensional Layered Model of Forward Scattering

As a first attempt, we design a two-dimensional (2-D) layered grid, instead of the three-dimensional (3-D) one. Since we voxelize the medium into a regular grid, the 2-D medium consists of parallel layers. Hereafter, a 3-D direction ω between vertices is written as a 2-D direction θ and a steradian measure dω as an angular measure dθ.

As shown in Fig. 2, we assume a particular layer scattering having the following properties. First, vertices x1,,xM of path x˜ are located at the centers of each voxel. The light source x0 is located on the boundary of the top surface of the voxels in the top layer. Similarly, the detector xM+1 is located on the boundary of the bottom surface of the voxels in the bottom layer. Second, directions θx0,x1 and θxM,xM+1 at the beginning and end of a path are perpendicular to the boundary. This means that scattering begins at x1 and ends at xM. Third, forward scattering happens layer by layer. More specifically, light is scattered at the center of a voxel in a layer and then goes to the center of a voxel in the next (below) layer. Scattering is assumed to happen every time the light traverses voxel centers. Even if the next voxel is just below the current voxel and the path segment is straight, it is regarded as scattering. Fourth, the scattering coefficient is uniform, σs(x)=σs.

Graphic Jump Location
Fig. 2
F2 :

Proposed two-dimensional layered model of scattering. This example shows path x˜ consisting of vertices x1,,xM located at the centers of voxels in a grid with M parallel layers. x0 is a light source located on the top surface, and xM+1 is a detector at the bottom. At each vertex, the light scatters to voxels in the next layer, and possible scattering directions are indicated by arrows.

By ignoring paths exiting from the sides of the grid, the number of all possible paths is NM, where M is the number of layers and N is the number of voxels in one layer.

Approximating the Phase Function with a Gaussian

We use a Gaussian model fp(θ,σ2) as an approximation of the phase function Display Formula

fp(xm1,xm,xm+1)fp(θm,σ2)=12πσ2exp(θm2σ2),π2<θm<π2,(23)
where the variance σ2 controls the scattering property; larger values of σ2 mean strong forward scattering. This Gaussian approximation is convenient in our case because of the following two reasons.

First, existing phase function models10,12,3538 are those for 3-D scattering, not for 2-D. This means that those functions are normalized for integrals over the unit sphere S2: S2fp(ω)dω=1. Most of the phase functions assume isotropy (rotational symmetry), and hence, the function has a form taking angle θ as an argument; however, ππfp(θ)dθ1. These functions, therefore, are not adequate for our case.

Second, our assumption of layer-wise forward scattering does not allow scattering to happen backwards or sideways, and the Gaussian model is suitable for it. As shown in Fig. 3, the Gaussian model has the form of forward-only scattering (no backwards or sideways) in a reasonable range of σ2, and it is almost normalized; π/2π/2fp(θ,σ2)dθ1. Other 2-D phase functions exist which are not forward-only. For example, Heino et al.62 introduced a 2-D analog of Henyey-Greenstein’s phase function,35 shown in Fig. 3. Although the parameters are different, the two functions in Fig. 3 have similar shapes. The most important difference is that Heino’s function has backward scattering, but our Gaussian model does not. More realistic scattering rather than the layer-wise forward scattering introduced here needs Heino’s or Henyey-Greenstein’s phase function.

Graphic Jump Location
Fig. 3
F3 :

Comparison of two-dimensional phase functions. The upward vertical direction is θ=0, and horizontal directions are θ=±(π/2). (a) Gaussian approximated phase functions with σ2=0.1,0.2,,1.0. The tallest and narrowest shapes correspond to σ2=0.1, and the shape becomes shorter and rounder for larger values of σ2. (b) Heino’s two-dimensional analogs62 of Henyey-Greenstein’s phase function with parameter g=0.1,0.2,,1.0. The tallest and narrowest shapes correspond to g=1.0, and the shape becomes shorter and close to a hemisphere for smaller values of g.

We should note one further simplification in our layer-wise forward scattering model. The angle θm in the phase function is usually defined between θxm1,xm and θxm,xm+1, that is, the difference of directions changed by the scattering event. Instead of dealing with such an exact difference of directions, we use the angle between θxm,xm+1 and the vertical (downward) direction for efficiency of computation. This assumption enables us to discretize the Gaussian phase function much more easily. Since fp(θ) integrates to (approximately) one, such a normalization can be discretized with a sum as follows: Display Formula

π/2π/2fp(θ,σ2)dθbBnfp(θb,σ2)Δθb1,(24)
where B is a set of voxel indices in the next layer n, θb is an alternative form of the corresponding θxm,xm+1, and Δθb is the angle measure as shown in Fig. 4.

Graphic Jump Location
Fig. 4
F4 :

An illustration of angle measure Δθb for voxel b in the next layer. For the center voxel of the upper layer, voxel b (shaded) in the next layer subtends an angle of Δθb, which is used for the angle measure in Eq. (24).

The above equation can be considered as the energy distribution from a voxel in one layer to the voxels in the next layer. For a voxel b at direction θb, the value of fp(θb,σ2)Δθb describes what percentage of the energy will be scattered to this voxel. Figure 5 shows plots of the values corresponding to two phase functions with different parameters. We can see that, due to forward scattering, most of the energy is concentrated in the voxel just below, and a small part goes to the adjacent voxels.

Graphic Jump Location
Fig. 5
F5 :

(a) The phase functions with parameter σ2=0.2 (dashed line) and σ2=0.4 (solid line). Plot of the value fp(θb,σ2)Δθb for each voxel b for (b) σ2=0.2 and (c) σ2=0.4. Note that index b is relative to the voxel in the next layer just below the voxel in consideration. The voxel just below is b=0, the voxel on its right side is b=1, and that on the left side is b=1.

The contribution Hk in Eq. (22) now needs to be rewritten so that it deals with the Gaussian phase function and the discretized energy distribution discussed above. First, we reorder the measure Display Formula

Hk=dA(x0)dθ(xM+1)m=1Mff(xm1,xm,xm+1)dθ(xm)(25)
Display Formula
=dA(x0)dθ(x1)m=1Mff(xm1,xm,xm+1)dθ(xm+1),(26)
and then replace the factors with the Gaussian phase function. Display Formula
Hk=dA(x0)Δθx0,x1σsMm=1Mfp(θxm,xm+1,σ2)Δθxm,xm+1.(27)

Note that the factor dA(x0)Δθx0,x1σsM is common for all paths because we assumed that the grid is uniform so that dA(x0) is constant, and the direction θx0,x1 (or ωx0,x1) is perpendicular to the top surface, and σs is constant.

Observation Model

Suppose the 2-D layered medium is an M×N grid; it has M layers, each of which is made of N voxels. We now construct an observation model of the light transport between a light source and a detector: emitting light to each of the voxels at the top layer, and capturing light from each voxel from the bottom layer. More specifically, let iB1 and jBM be voxel indices of the light source and detector locations, respectively. By restricting the light paths to only those connecting i and j, the observed light Iij is written as follows: Display Formula

Iij=I0k=1NijHijkeσtTDijk,(28)
where Hijk and Dijk are the same as in Eqs. (27) and (21), respectively, but are restricted to paths connecting i and j, and I0=Le(x0,x1) assuming the light source to be constant.

In the above equation, k indexes the light paths, which share the same i and j. Due to the layered scattering model in the N×M grid, the number of different paths between i and j is Nij=NM2. This is, however, too large even for small N and M, e.g., N=M=10. Therefore, we exclude paths having small contributions from the computation. This is done by a simple thresholding while computing Hijk as shown in Algorithm 1. This results in generating fewer paths; NijNM2. For example, there are Nij=742 paths for N=M=20 with σ2=0.4 when th=0.001, which enables us to reduce the computation cost.

Table Grahic Jump Location
Algorithm 1Computing contribution Hijk and omitting low contribution path by thresholding.

Next, we propose a method for the inverse problem of the forward model [Eq. (28)] to estimate the extinction coefficients of the 2-D layered model. As we mentioned before, we fix the light paths and assume that the scattering coefficients and parameters of the Gaussian phase function are uniform and known in advance.

Cost Function

In the M×N 2-D layered medium described in Sec. 3.6, we had assumed a configuration of a light source and detector similar to the left-most one shown in Fig. 6; the light source is located above the medium and the detector is below, and the observed light is Iij, where i,j are the voxel indices of the light source and detector locations. By sliding the light source and the detector, we can obtain N2 observations, resulting in the following least-squares equation: Display Formula

minσtf0,f0=i=1Nj=1N|IijI0k=1NijHijkeσtTDijk|2,(29)
under 2MN constraints Display Formula
0σtu,(30)
where denotes the generalized inequality, i.e., all elements in the vector must satisfy the inequality. The lower bound 0 comes from the fact that any media must have positive extinction coefficients, while the upper bound u is used for numerical stability to exclude unrealistic values to be estimated.

Graphic Jump Location
Fig. 6
F6 :

Four configurations of light sources and detectors. From left to right, we call configurations T2B (top-to-bottom), L2R (left-to-right), B2T (bottom-to-top), and R2L (right-to-left), which represent locations of light sources and detectors.

Furthermore, as shown in Fig. 6, we have four configurations of light sources and detectors by changing their positions. This gives us four different sets of observations Iij and paths ijk. These four different sets lead to four objective functions (fT2B, fL2R, fB2T, fR2L) as shown in Fig. 6. Since the four objective functions share the same variables σt, we can use all of them at the same time by adding them to form a new single function f0 at the expense of additional (factor of four) computation cost. Display Formula

minσtf0,f0=f0T2B+f0L2R+f0B2T+f0R2Lsubject to  0σtu.(31)

Optimization Problem with Inequality Constraints

Since the inverse problem [Eq. (31)] is nonlinear, we employ an interior point method,26 an iterative optimization algorithm for problems with constraints. Here, we first review several key points in optimization; then we will develop an algorithm to solve Eq. (31) along with the required first- and second-order derivatives of the cost function.

Unconstrained problem: Quasi-Newton

First, we review optimization without constraints, which is used inside the interior point method. The general form of unconstrained optimization is Display Formula

minσtf(σt),(32)
where σtRN×M is a real vector and f:  RN×MR is an objective function that is twice continuously differentiable.

To solve it, an iterative procedure begins with an initial guess σt0 and generates a sequence {σtk}k=0. It stops when the change of solutions is small enough. The information about function f at σtk or even previous estimates σt0,σt1,,σtk1 is used to calculate a direction pk to move with a step size αk. A line search is often used to determine the step size by searching along the direction starting from σtk for finding σtk+1 with the least value of the objective function Display Formula

minαk>0f(σtk+αkpk).(33)

Once we find the step size, the estimate σtk+1 is updated as σtk+1σtk+αkpk. The direction is pk=Bkf(σtk) for the Newton’s method, where Bk=2f(σtk)1 is the inverse of the Hessian.

The Newton’s method is well known for its second-order convergence and accuracy. However, when the dimension of the problem is large, calculating the Hessian and its inverse is computationally expensive. Therefore, Quasi-Newton methods are often used, where the inverse Hessian is updated by incremental approximations in order to reduce the computation cost. The Broyden-Fletcher-Goldfarb-Shanno (BFGS) update rules are well known.63Display Formula

s=σtkσtk1,(34)
Display Formula
y=f(σtk)f(σtk1),(35)
Display Formula
Bk=(IsyTyTs)Bk1(IysTyTs)+ssTyTs.(36)

When the conditions yTs>0 and B00 (where 0 means positive definite) are satisfied, the BFGS update guarantees the positive definiteness of Bk. Algorithm 2 shows the Quasi-Newton method.

Table Grahic Jump Location
Algorithm 2The Quasi-Newton method with BFGS update rule.
Constrained problem: interior point

Here we introduce a constrained optimization with inequality constraints of the form Display Formula

minσtf0(σt)subject to  fi(x)0,i=1,,m,(37)
where σtRN×M is a real vector and f0,,fm:  RN×MR are twice continuously differentiable.

The idea is to approximate it as an unconstrained problem. Using Lagrange multipliers, we can first rewrite Eq. (37) as Display Formula

minσtf0(σt)+i=1mI[fi(σt)],(38)
where I:  RR is an indicator function, which keeps the solution inside the feasible region. Display Formula
I(f)={0,f0,f>0.(39)

Equation (38) now has no inequality constraints, while it is not differentiable due to I.

The barrier method26 is an interior point method that introduces a logarithmic barrier function to approximate the indicator function I as follows: Display Formula

I^(f)=(1/t)log(f),(40)
where t>0 is a parameter to adjust the accuracy of approximation. The log barrier function goes to infinity rapidly as f goes close to 0, while it is close to 0 when f is far away from 0. Since I^(f) is differentiable, we have Display Formula
minσtf0(σt)+i=1m(1/t)log[fi(σt)],(41)
or equivalently, Display Formula
minσttf0(σt)i=1mlog[fi(σt)].(42)

The barrier method solves Eq. (42) iteratively by increasing the parameter t. At the limit of t, the above problem coincides with the original problem [Eq. (38)].

Algorithm for Solving the Inverse Problem

Algorithm 3 shows our algorithm, which uses a barrier method with Quasi-Newton for solving the inverse problem. We should mention the following parts where we have modified the original algorithm.26

Table Grahic Jump Location
Algorithm 3Barrier method of interior point with Quasi-Newton solver.

Warm start: For each inner loop, the Quasi-Newton method needs an initial guess of the inverse Hessian B0. Instead of fixing B0 for every inner loop, we reuse the Bk of the last inner loop to accelerate the convergence (shown in lines 4 and 19 in Algorithm 3).

Checking feasibility: Since the Quasi-Newton method and line search estimate without constraints, the next estimate σtk+1 may go beyond the constraints; in our case, each element σtk+1[b] in σtk+1 must be inside [0,u] after the step size has been determined. Therefore, in line 8, we check the feasibility of the estimate σtk+1 for the current step size αk. If it exceeds the boundary of the feasible region, we pull the estimate back into the feasible region by halving the step size. If it is still outside the feasible region, then the step size is halved again. Why do we not just set the step size so that σtk+1 is exactly on the boundary? The reason is the log-barrier: if σtk+1 is on the boundary, in other words, σtk+1[b] is either 0 or u, then log(σt[b]) or log(uσt[b]) becomes infinite, which results in numerical instability. Therefore, the procedure described above is needed.

Checking for positive definiteness: The BFGS update rules guarantee Bk to be positive definite if yTs>0 and B0 are satisfied. While the latter is satisfied by giving an appropriate initial guess, the former depends on the updates at each iteration. If it is not satisfied, then the BFGS updates are no longer valid, and we reset the inverse Hessian Bk to a scaled identity63 at line 16.

Jacobian

Here, we represent the Jacobian of the objective function f0 in Eq. (29). Note that the objective function f0 in Eq. (31) can be derived in the same manner.

We first rewrite the objective function f0 as follows: Display Formula

f0=i=1Nj=1N|IijI0k=1NijHijkeσtTDijk|2(43)
Display Formula
=i=1Nj=1N(Iij22IijI0k=1NijHijkeσtTDijk+I02k=1Nijl=1NijHijkeσtTDijkHijleσtTDijl)(44)
Display Formula
=i=1Nj=1N[Iij22IijI0k=1NijHijkeσtTDijk+I02k=1Nijl=1NijHijkHijleσtT(Dijk+Dijl)],(45)
and the gradient of f0 is given by Display Formula
f0σt=i=1Nj=1N[2IijI0k=1NijHijkeσtTDijkDijkI02k=1Nijl=1NijHijkHijleσtT(Dijk+Dijl)(Dijk+Dijl)].(46)

To simplify the equation, we use the following notation: Display Formula

E=[eσtTDij1eσtTDij2eσtTDijNij],H=[Hij1Hij2HijNij],(47)
Display Formula
Dij=[Dij1Dij2DijNij],D˜ij=[Dij1+Dij1Dij1+Dij2Dij1+DijNijDij2+Dij1Dij2+Dij2Dij2+DijNijDijNij+Dij1DijNij+Dij2DijNij+DijNij].(48)

Now, f0 and the gradient can be represented as Display Formula

f0=i=1Nj=1N[Iij22IijI0ETH+I02(ETH)2],(49)
Display Formula
f0σt=i=1Nj=1N(2IijI0sum[(E×H)Dij]I02sum{[(E×H)(E×H)T]D˜ij}),(50)
where sum[] stands for the sum over the elements of the container [Eq. (49)] of vectors, × is the element-wise product, and denotes the tensor product, defined as Display Formula
A=[a11a12a1ma21a22a2man1an2anm],B=[b11b12b1mb21b22b2mbn1bn2bnm],(51)
Display Formula
AB=[a11b11a12b12a1mb1ma21b21a22b22a2mb2man1bn1an2bn2anmbnm].(52)

In this section, we report the results obtained by numerical simulations using the proposed model.