1. Introduction

With the advance of imaging sensors and microelectronics, multimodality image fusion has emerged as a promising research area. With a proper fusion rule, multimodality images are combined into a single composite (i.e., fused) image for human and machine perception or for further image processing tasks such as segmentation, feature extraction, and target recognition.1,2 An effective fusion rule therefore improves the quality of the fused image. At present, the most commonly used fusion rules can be divided into two types: the choose-max strategy and the weighted average strategy. For example, Zheng et al.3 combined multiple sets of low/support value components using the choose-max strategy and the weighted average strategy. In Ref. 4, the detail subbands are combined by the choose-max strategy using a standard deviation measure, and the approximation subbands are combined with the weighted average strategy using an entropy measure. Li et al.5 applied homogeneity similarity to multifocus image fusion, in which the absolute-maximum-choosing strategy and the average strategy are employed to fuse the detail and approximation subbands, respectively. Moreover, choose-max strategies driven by the fire map of a pulse-coupled neural network are often used to fuse the subband coefficients.6,7

In addition to these pixel-based methods, several researchers have argued that region-based fusion is more robust and more easily expresses the local structural characteristics of objects.8,9 Therefore, many region-based fusion methods have been proposed in recent years, and the above-mentioned fusion rules have been widely extended to the region level. For instance, Piella proposed a choose-max strategy using a local correlation measure to fuse each region.8 In Ref. 10, the priority of a region, measured by the energy, variance, or entropy of the regional wavelet coefficients, is used to weight that region in the fusion process. In Ref. 11, Li and Yang fused the corresponding regions by the choose-max strategy with spatial frequency (for simplicity, we call this method RSSF).

It is evident that all the above-mentioned image fusion methods use only a single fusion rule to fuse all high/low-frequency subband coefficients or pixels. However, the source images exhibit different dynamic ranges and correlations, and a single fusion rule may decrease the contrast of the fused image. A recent trend is therefore multistrategy fusion rules. Examples include the works in Refs. 12 and 13, which employ a similarity index and a threshold to distinguish the type of corresponding regions between source images; both the weighted average strategy and the choose-max strategy are employed, the former focusing on redundant information and the latter on complementary information. Furthermore, to avoid tuning the threshold, Luo et al.14 used the structural similarity (SSIM) index to identify the type of region, which further guides the selection of the fusion strategy. Although these multistrategy fusion methods have enhanced the quality of image fusion to some extent, there are still some drawbacks in the fusion process. First, for the redundant regions, when the redundant degree of a region is large, the weighted average strategy degenerates into the plain average strategy; conversely, when the redundant degree is small, it reduces to the choose-max strategy. Therefore, the redundant degree of the regions should be considered in the weighted average strategy.
Second, the source images carry not only local structural characteristics but also the activity measures of individual coefficients. If both aspects are considered in the selection of a multistrategy fusion rule, the fused image will be more effective.

In this paper, an adaptive multistrategy image fusion method is proposed. The whole architecture is constructed with a data plane and a feature plane. In the data plane, source images are decomposed into low-frequency coefficients and high-frequency coefficients with a shift-invariant shearlet transform (SIST). Since the low-frequency coefficients carry only the approximate information, the choose-max strategy with the local average energy is simply adopted to fuse them. The high-frequency coefficients, in contrast, represent a great deal of detailed information; thus, a multistrategy fusion rule combining the choose-max and weighted average strategies is applied to fuse these coefficients. In the feature plane, to integrate the two strategies, source images are partitioned into windows, the corresponding features are extracted, and the windows are clustered into regions. Moreover, the dissimilarity of regions and SIST-based activity measures are fed into a sigmoid function to achieve a flexible multistrategy fusion rule. The detailed diagram is depicted in Fig. 3. The method consists of four main stages: (1) obtain the region map; (2) quantify the characteristics of corresponding regions and distinguish their type; (3) calculate the activity measure of the SIST coefficients; and (4) connect the fusion strategy selection with the characteristics of corresponding regions and the SIST-based activity measures through a sigmoid function. This paper is an extended version of our recent conference paper at the International Conference on Pattern Recognition (ICPR).15

The remainder of this paper is organized as follows. Section 2 briefly reviews the principle of SIST. In Sec. 3, the framework of the proposed method is described. Section 4 explains how to obtain the region map. Section 5 discusses how to quantify the characteristics of corresponding regions and distinguish their type. The proposed fusion rule is described in Sec. 6. Finally, experimental results and conclusions are given in Secs. 7 and 8, respectively.

2. Principle of SIST

Wavelets are very efficient only when dealing with point-wise singularities. In higher dimensions, other types of singularities are usually present or even dominant, and wavelets cannot handle them efficiently. To overcome this limitation of traditional wavelets, their directional sensitivity has to be increased. SIST is a state-of-the-art multiscale decomposition method with a rich mathematical structure similar to that of wavelets; it provides a true two-dimensional (2-D) sparse representation for images with edges and is shift invariant.16,17 The SIST is computed in two steps: multiscale partition and directional localization. In the first step, shift invariance, i.e., low sensitivity to image shifts, is achieved by a nonsubsampled pyramid filter scheme in which the Gibbs phenomenon is suppressed to a great extent because down-samplers are replaced with convolutions. In the second step, the frequency plane is decomposed into a low-frequency subband and several trapezoidal high-frequency subbands by shift-invariant shearing filters.
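To make the first step concrete, here is a minimal Python sketch of one level of a nonsubsampled ("à trous") pyramid, the mechanism by which down-samplers are replaced with (dilated) convolutions. It illustrates only the shift-invariant multiscale stage, not SIST itself: the B3-spline lowpass filter and the function name atrous_level are our own assumptions, and the directional shearing step is omitted.

```python
import numpy as np
from scipy.ndimage import convolve1d

def atrous_level(img, level):
    """One level of a nonsubsampled ('a trous') pyramid: instead of
    downsampling the image, the lowpass filter is dilated by 2**level,
    so both subbands keep the input size and the decomposition is
    shift-invariant."""
    h = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # B3-spline lowpass (assumed)
    if level > 0:
        dilated = np.zeros((len(h) - 1) * 2**level + 1)
        dilated[::2**level] = h  # insert 2**level - 1 zeros between taps
        h = dilated
    # Separable 2-D filtering via two 1-D convolutions.
    low = convolve1d(convolve1d(img, h, axis=0, mode='reflect'),
                     h, axis=1, mode='reflect')
    high = img - low  # detail subband, same size as the input
    return low, high

img = np.random.rand(256, 256)
low0, high0 = atrous_level(img, level=0)
low1, high1 = atrous_level(low0, level=1)
# Shifting the input shifts every subband by the same amount, which is
# the shift-invariance property the SIST pyramid stage relies on.
```

Because no subband is decimated, a shift of the input produces an identical shift of every subband, which is why ringing artifacts around edges are reduced relative to decimated schemes.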
To intuitively illustrate the principle of SIST and show its superiority over the stationary wavelet transform (SWT), two-level SIST and SWT decompositions of the zoneplate image are shown in Figs. 1 and 2, respectively. Here, the basis function of SWT is set as Symlets 4 (sym4) and the SIST parameter determining the number of directions is set to [2, 3], so the numbers of directions for each scale from coarse to fine are 6 and 10.

3. Framework of Proposed Method

Since this paper focuses on image fusion, we assume that the source images to be fused have been geometrically registered. The system diagram of the proposed method is depicted in Fig. 3. The general procedure of the proposed method can be divided into a feature plane and a data plane.
In the feature plane, the source images are divided into windows, window features are extracted, the windows are clustered into the region map, and the characteristics of corresponding regions are quantified (Secs. 4 and 5). In the data plane, the source images are decomposed by SIST, the low-frequency and high-frequency subbands are fused with the strategies described in Sec. 6, and the fused image is obtained by the inverse SIST.
In the following subsections, the proposed algorithm is explained in detail.

4. Generation of Region Map

In this section, we explain how to generate the region map. First, the source images are divided into windows and the features of the windows are extracted. Second, feature vectors are constructed in the feature difference space. Third, the region map is obtained by clustering the feature vectors using FCM.

4.1. Feature Extraction

In feature-level image fusion, an important step is to extract features from the source images. Since the sharpness and edges of an image are represented by the high-frequency coefficients, the features of windows are extracted not only from the pixels of a window but also from the high-frequency coefficients of a window. The source images A and B are decomposed at the first level, giving six high-frequency subbands for each image; the source images and their high-frequency subbands are then divided into corresponding windows. In this paper, the variance,18 the gradient,6 and the gray scale extracted by 2-DPCA19 are selected as the features of the spatial domain. The variance18 and energy18 of the high-frequency subband windows are extracted as the features of the SIST domain. These features are defined as follows.

4.1.1. Variance

The variance reflects the relative degree of dispersion among the pixels in a window. Let $I(i,j)$ be the pixel intensity located at $(i,j)$ in an $M\times N$ window and let $\bar{I}$ be the mean intensity of the window; the variance $V$ of the window is then

$$V=\frac{1}{M\times N}\sum_{i=1}^{M}\sum_{j=1}^{N}\left[I(i,j)-\bar{I}\right]^{2}.$$

4.1.2. Gradient

The gradient feature of a window reflects its sharpness and is calculated as in Ref. 6; the calculation for the corresponding window of the other source image is similar.

4.1.3. Gray scale of window

The gray scale feature of a window is extracted by two-dimensional principal component analysis (2-DPCA), which was proposed by Yang et al.19 In contrast with the PCA method, 2-DPCA avoids reshaping the image window into a vector, so the window structure is kept and the computational complexity is significantly reduced. For a window set $\{W_{k}\}_{k=1}^{K}$, the mean window matrix is defined as

$$\bar{W}=\frac{1}{K}\sum_{k=1}^{K}W_{k}.$$

The covariance matrix of the window set is defined as

$$C=\frac{1}{K}\sum_{k=1}^{K}\left(W_{k}-\bar{W}\right)^{T}\left(W_{k}-\bar{W}\right).$$

The projection matrix $U$ is formed by the eigenvectors corresponding to the first largest eigenvalues of $C$. The window set is projected into the eigenspace by

$$Y_{k}=W_{k}U,$$

and $Y_{k}$ is the gray scale feature of window $W_{k}$.

4.1.4. Energy

Let $S(i,j)$ be the SIST coefficient located at $(i,j)$ in an $M\times N$ window; the energy of the SIST coefficients in the window is

$$E=\sum_{i=1}^{M}\sum_{j=1}^{N}S(i,j)^{2}.$$

Next, we use the feature differences of the source images to construct the feature vectors. Each feature vector consists of 15 feature differences: the differences of the three spatial-domain features and, for each of the six high-frequency subbands, the differences of the variance and energy. The final feature vectors are normalized.

4.2. Region Segmentation Using FCM

The feature vectors reflect the characteristics between source images. To intuitively illustrate this fact, Fig. 4 shows histograms of the variance difference and the gradient difference (DG) of the multifocus clock images [Figs. 4(a) and 4(b)]. As can be seen in Figs. 4(c) and 4(d), the distributions of the histograms are continuous, which means that these differences have the ability to reflect the redundancy between source images. The histograms of the other feature differences have the same properties and are omitted because of space limitations. Due to the local correlation of source images, the characteristics of a source are region related. Therefore, the remaining problem is to partition the feature vectors into clusters. Since fuzzy clustering techniques have been used effectively in image processing, the FCM algorithm,20 a well-known fuzzy clustering method, is adopted to cluster the feature vectors in this paper. Assume that the number of clusters is $C$ and let $u_{ik}$ denote the membership of the $i$'th feature vector in the $k$'th cluster. If $u_{ik}=\max_{j}u_{ij}$, the feature vector belongs to the $k$'th cluster. A minimal sketch of the feature extraction and clustering steps is given below.
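The sketch below illustrates the pipeline of this section under simplifying assumptions: window_features computes only two of the 15 feature differences (the 2-DPCA gray-scale feature and the SIST-domain features are omitted for brevity), fcm is a plain textbook fuzzy c-means rather than the exact implementation of Ref. 20, and all names are hypothetical.

```python
import numpy as np

def window_features(wa, wb):
    """Feature-difference vector for one pair of corresponding windows
    (variance and gradient differences only)."""
    def variance(w):
        return np.mean((w - w.mean()) ** 2)
    def gradient(w):
        gy, gx = np.gradient(w.astype(float))
        return np.mean(np.hypot(gx, gy))
    return np.array([abs(variance(wa) - variance(wb)),
                     abs(gradient(wa) - gradient(wb))])

def fcm(X, c=10, m=2.0, iters=100, eps=1e-5, seed=0):
    """Plain fuzzy c-means: alternate centroid and membership updates
    until the membership matrix U (n_samples x c) stabilizes."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        e = 2.0 / (m - 1.0)
        U_new = 1.0 / (d**e * np.sum(d**-e, axis=1, keepdims=True))
        if np.abs(U_new - U).max() < eps:
            return U_new, centers
        U = U_new
    return U, centers

# Each window pair yields one feature-difference vector; the maximum
# membership assigns every window to a region of the region map.
X = np.random.rand(500, 2)          # placeholder feature vectors
U, _ = fcm(X, c=10)
labels = U.argmax(axis=1)           # region index per window

# Regional dissimilarity (used in Sec. 5): mean feature-vector
# magnitude per region; zero vectors mean fully redundant windows.
ds = np.array([np.linalg.norm(X[labels == k], axis=1).mean()
               if np.any(labels == k) else 0.0 for k in range(10)])
```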
As a result, the feature vectors are partitioned into groups, which yields the region map. In this paper, the number of clusters is set to 10 and the window size to 4×4 by cross-validation; the tuning of these parameters is discussed in Sec. 7.3.

5. Quantization of Region Characteristics

The characteristics of corresponding regions can be quantified by their dissimilarity, which reflects their complementarity. In theory, the feature vector of fully redundant corresponding windows should be the zero vector. Therefore, the dissimilarity $DS_{r}$ of the $r$'th pair of corresponding regions can be calculated as the average magnitude of the feature vectors belonging to that region,

$$DS_{r}=\frac{1}{N_{r}}\sum_{F_{i}\in R_{r}}\left\Vert F_{i}\right\Vert,$$

where $N_{r}$ is the number of feature vectors in region $R_{r}$. The higher the value of $DS_{r}$, the higher the complementarity. To identify the redundancy or complementarity of a corresponding region, the complementary seed region is defined as the region with the largest dissimilarity and the redundant seed region is defined as the region whose feature vectors are zero, so that its dissimilarity is 0. After the dissimilarities are normalized, the distances of $DS_{r}$ to the complementary seed and to the redundant seed are computed; by comparing the two distances, each region is labeled as near-complementary or near-redundant, and the regions are thereby divided into complementary parts and redundant parts. Figure 4(e) shows an example of the region map between Figs. 4(a) and 4(b), in which the regional attributes are represented by different gray levels: points with stronger brightness express more dissimilarity (i.e., complementarity).

6. Proposed Fusion Rule

Generally speaking, the purpose of image fusion is to preserve all the useful information in the source images. In this section, we first briefly review the related knowledge, and then the details of our multistrategy fusion rule are given.

6.1. Related Knowledge

6.1.1. Commonly used fusion strategies

The multistrategy fusion rule includes two commonly used fusion strategies, i.e., the choose-max strategy and the weighted average strategy. The choose-max strategy can be written as

$$F(i,j)=\begin{cases}A(i,j), & M_{A}(i,j)\geq M_{B}(i,j)\\ B(i,j), & \text{otherwise},\end{cases}$$

where $F(i,j)$ is the fused coefficient located at $(i,j)$, $A(i,j)$ and $B(i,j)$ are the coefficients of the source images located at $(i,j)$, and $M_{A}(i,j)$ and $M_{B}(i,j)$ are their activity measures. The salient feature of a coefficient (e.g., variance or gradient) is expressed by the so-called activity; the coefficient with the higher activity measure contains richer information than the other and is directly selected as the fused coefficient. The weighted average strategy can be written as

$$F(i,j)=w_{A}A(i,j)+w_{B}B(i,j),$$

where the weights $w_{A}$ and $w_{B}$ are calculated according to the specific task. In general, when the source images are complementary, the choose-max strategy should be applied; otherwise, the weighted average strategy should be employed.

6.1.2. Sigmoid function

A sigmoid function is a mathematical function having an "S" shape (sigmoid curve),18 which is shown in Fig. 5 and is defined here as

$$\mathrm{Sf}(x)=\frac{1}{1+e^{-a(x-1)}},$$

where $a$ is the shrink factor and $x$ is the variable; together they control the shape of the sigmoid curve. We plot the sigmoid function Sf with different shrink factors in Fig. 5. The shrink factor controls the steepness of the sigmoid curve, and there is a pair of horizontal asymptotes as $x\rightarrow\pm\infty$. For the same $a$, when $x$ is very large or very small, Sf approaches 1 or 0; however, when $x$ is close to 1, Sf approaches 0.5. For the same $x$, when $a\rightarrow\infty$, Sf is equivalent to $[\mathrm{sgn}(x-1)+1]/2$, where $\mathrm{sgn}$ is the sign function; when $a=0$, Sf is equal to 0.5.
As can be seen from Fig. 5, the sigmoid function plays two roles, a selection role and a weighted average role, which are determined by $a$ and $x$. This behavior is exactly in line with the choose-max strategy and the weighted average strategy. Moreover, $a$ and $x$ can be derived from the characteristics of the corresponding regions and from the difference of activity measures of the corresponding coefficients, respectively. Therefore, it is appropriate to use the sigmoid function to design a multistrategy fusion rule.

6.2. Fusion of the Low-Frequency Subbands

The energy of a source image is concentrated in the low-frequency part, and adjacent coefficients retain local correlation. Therefore, to obtain a high-contrast result, the low-frequency subbands are fused by the choose-max strategy with the local average energy:

$$L_{F}(i,j)=\begin{cases}L_{A}(i,j), & E_{A}(i,j)\geq E_{B}(i,j)\\ L_{B}(i,j), & \text{otherwise},\end{cases}$$

where $L_{F}(i,j)$ denotes the fused low-frequency coefficient located at $(i,j)$, and $L_{A}(i,j)$ and $L_{B}(i,j)$ are the low-frequency coefficients of the source images. $E_{A}(i,j)$ and $E_{B}(i,j)$ are the local average energies of $L_{A}(i,j)$ and $L_{B}(i,j)$, computed over an $M\times N$ local window $\Omega$ as

$$E_{X}(i,j)=\frac{1}{M\times N}\sum_{(m,n)\in\Omega}L_{X}(i+m,j+n)^{2},\quad X\in\{A,B\}.$$

Our purpose is to preserve as much of the useful information of the source images as possible.

6.3. Fusion of the High-Frequency Subbands

Since the high-frequency subbands tend to contain many image details (such as edges and region boundaries), the quality of the fusion rule for the high-frequency coefficients strongly affects the fused result. To enhance the quality of the fused image, an adaptive multistrategy fusion rule with a sigmoid function is designed as

$$H_{F}^{l,d,r}(i,j)=w\,H_{A}^{l,d,r}(i,j)+(1-w)\,H_{B}^{l,d,r}(i,j),\quad w=\mathrm{Sf}(x),$$

where $H_{F}^{l,d,r}(i,j)$ represents the fused high-frequency coefficient located at $(i,j)$ of the $l$'th level and $d$'th high-frequency subband for the $r$'th region, and $H_{A}^{l,d,r}(i,j)$ and $H_{B}^{l,d,r}(i,j)$ have similar meanings. The weight $w$ is calculated by the sigmoid function, where $a$ is the shrink factor and $x$ is the variable. To achieve an adaptive multistrategy fusion rule, $a$ and $x$ should be image dependent. Considering the local correlation of pixels, the shrink factor is derived from the characteristics of the corresponding regions, i.e., from the regional dissimilarity. Let $M_{A}^{l,d,r}(i,j)$ and $M_{B}^{l,d,r}(i,j)$ represent the activity measures of the corresponding high-frequency coefficients of images A and B; their ratio $x$ represents the difference of activity measures of the corresponding coefficients. Together, $a$ and $x$ determine the role of the combined strategy; we discuss them in the following subsections.

6.3.1. Discussion of the shrink factor

For the same $x$, the role of the sigmoid function is determined by $a$, which controls the steepness of the sigmoid curve. The selection of a strategy depends on the characteristics of the corresponding regions; therefore, we connect $a$ with the regional dissimilarity $DS_{r}$ through a positive parameter, which is fixed at 100 in the proposed method. When the corresponding regions are complementary, the choose-max strategy should be adopted for these regions; this status is equivalent to the sigmoid function with $a\rightarrow\infty$. When the corresponding regions are redundant, the weighted average strategy should be employed; this status is equivalent to the sigmoid function with a finite $a$ determined by the dissimilarity. We plot the relationship between $a$ and $DS_{r}$ in Fig. 6: the value of $a$ increases with $DS_{r}$. Moreover, we plot the sigmoid function with different $a$ in Fig. 7: for the redundant corresponding regions, as $a$ increases, the sigmoid function becomes steeper.
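To make the interplay of $a$ and $x$ concrete, the following sketch implements the high-frequency fusion weight, assuming the logistic form Sf(x) = 1/(1 + exp(-a(x - 1))) reconstructed above from the limits shown in Fig. 5; the 5×5 window and all function names are our own assumptions, so this is an illustration rather than the reference implementation.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def activity(H, size=5):
    """Activity measure of a high-frequency subband: the local variance,
    averaged again over the window to capture the neighbor dependency
    (the 5x5 window size is an assumption)."""
    mu = uniform_filter(H, size)
    var = np.maximum(uniform_filter(H**2, size) - mu**2, 0.0)
    return uniform_filter(var, size)

def sigmoid_weight(x, a):
    """Sf(x) = 1 / (1 + exp(-a(x - 1))): equals 0.5 at x = 1 and
    saturates toward 0/1 as x moves away from 1; a sets the steepness."""
    z = np.clip(-a * (x - 1.0), -500.0, 500.0)  # avoid exp overflow
    return 1.0 / (1.0 + np.exp(z))

def fuse_high(Ha, Hb, a):
    """Multistrategy fusion of corresponding high-frequency coefficients:
    w near 0 or 1 reproduces choose-max, w near 0.5 reproduces
    weighted averaging."""
    x = activity(Ha) / (activity(Hb) + 1e-12)  # activity-measure ratio
    w = sigmoid_weight(x, a)
    return w * Ha + (1.0 - w) * Hb

Ha, Hb = np.random.randn(64, 64), np.random.randn(64, 64)
averaged = fuse_high(Ha, Hb, a=2)     # redundant region: flat curve
selected = fuse_high(Ha, Hb, a=100)   # complementary region: steep curve
```

The same machinery degenerates to the two classical strategies: a large shrink factor saturates the weight at 0 or 1 (choose-max), whereas a small one keeps it near 0.5 (weighted average). The low-frequency subbands, by contrast, are fused with the plain choose-max rule of Sec. 6.2.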
This steepening means that the fusion rule is gradually transformed from the weighted average strategy to the choose-max strategy. In this way, not only the type of region (complementary or redundant) but also the redundant degree is considered.

6.3.2. Calculation of the variable

For the same $a$, the variable $x$ also affects the selection of the fusion strategy. According to the previous discussion, the difference of activity measures of the corresponding high-frequency coefficients plays an important role in strategy selection. The variable is therefore calculated from the local statistics of the coefficients: for each source image, the mean and variance of an $M\times N$ local window centered at $(i,j)$ are computed, and the variance averaged over the local window, which takes the neighbor dependency into account, is taken as the activity measure of the coefficient. The ratio of the two activity measures gives $x$ and represents the difference of the corresponding coefficients. When the difference is large, the corresponding coefficients are complementary and the choose-max strategy should be adopted; this status is equivalent to the sigmoid function with a very large or very small $x$. When the difference is small, the corresponding coefficients are redundant and the weighted average strategy should be employed; this status is equivalent to the sigmoid function with $x$ near 1. This behavior can be seen in Fig. 7.

7. Experimental Results and Analysis

To verify the proposed method, it is compared with conventional and state-of-the-art fusion methods in terms of visual perception and objective evaluation. The six compared fusion approaches are the average-based (AVE-based) approach, the PCA-based approach,1 the SIST-based approach,2,16 RSSF,11 image fusion incorporating Gabor filters and fuzzy c-means clustering (for simplicity, we call this method GFFC),12 and RF_SSIM.14 These algorithms were chosen because they cover typical fusion, region-based single-strategy fusion, region-based multistrategy fusion, and recent fusion methods, and because they are easy to reproduce. Moreover, to demonstrate the superiority of the adaptive shrink factor, two experiments with fixed shrink factors (one of them $a=80$) are executed. Furthermore, to test the superiority of SIST, the proposed method is compared with variants using the SWT and the nonsubsampled contourlet transform (NSCT). The quantitative comparison of different fusion algorithms should consider the following aspects: the edge intensity, the amount of information in the fused image, and the relationship between the source images and the fused image. Thus, the following image quality metrics (IQM)21–25 are used in this paper:
For the SIST-based method and the proposed method, the pyramid filter of SIST is set as "maxflat" (maximally flat filter) and the source images are decomposed into four levels with 6, 6, 10, and 10 directions. Furthermore, the number of clusters is 10, the window size is 4×4, and the sliding window size is 5×5 in the proposed method; the effects of these parameters are discussed in Sec. 7.3.

7.1. Experiment 1: Performance Evaluation of the Proposed Fusion Method

7.1.1. Fusion results of multifocus images

Because optical imaging cameras cannot capture objects at various distances all in focus, multifocus images are fused to obtain an image in which the focused parts of all the source images are preserved. Figures 4(a) and 4(b) are a pair of multifocus images that share a common scene structure. Figure 8 and Table 1 show the fused images and evaluation results, and Fig. 9 shows detailed blocks (the "6" in the right clock) of the source images and the fusion results. Moreover, for a clearer comparison, the differences between the parts of Fig. 4(b) and the corresponding parts of the fusion results are also illustrated in Fig. 9. As shown in Fig. 9, Figs. 9(c) and 9(e) contain a lot of residual information, which means that the traditional fusion methods (the AVE-based and PCA-based methods) reduce contrast. This results from the fact that the AVE-based method simply takes the pixel-by-pixel gray-level average of the source images, and the PCA-based method uses a weighted average rule with the eigenvalues of the covariance of the source images. Figures 8(c) and 8(f) are the fusion results of the SIST-based approach and RSSF, which employ a single fusion strategy. Although there is almost no residual information in Fig. 9(m), some incorrectly fused blocks are found in the rectangular region of Fig. 8(f). The reason lies in the fact that some incorrect segmentations degrade the region-based choose-max fusion rule; for example, as shown in Fig. 8(e), some pixels located at the upper boundary of the right clock are segmented into different regions. Figure 8(c) is the fused result of the SIST-based approach, which adopts a coefficient-based choose-max fusion rule; Fig. 9(g) shows that some information of Fig. 4(b) is lost in the fusion process. The fusion results of the multistrategy fusion methods (GFFC and RF_SSIM) are shown in Figs. 8(d) and 8(h). Evidently, the contrast of these fused images is reduced, especially in Fig. 8(h), which is further illustrated by Figs. 9(i) and 9(k). This is because their multistrategy fusion rules ignore the activity differences among the coefficients within a region. As can be seen from Fig. 8(k), almost all the useful information of the source images has been transferred into the fused image by our proposed method. Moreover, Fig. 9(o) is almost black, which means that the residual image between Figs. 9(a) and 9(n) is very small. This observation further illustrates the information integration ability of the proposed method.

Table 1 Objective comparison of fused results with different methods for multifocus images.
Note: The bold values represent the best results.

Considering the similar fused results of Figs. 8(i)–8(k), objective evaluations are performed and the results are listed in Table 1. Most of the objective evaluation results are in reasonable agreement with the visual effect. For example, the low contrast of Figs. 8(a), 8(b), and 8(h) is quantified by the lagging indicators (such as EI and Qabf). The fused images of AVE, PCA, RSSF, and RF_SSIM obtain better MI results because these methods operate in the spatial domain. RSSF directly copies original regions from the source images to the fused image, so the pixel distributions of Fig. 8(f) change very little; the best RCE value of RSSF confirms this observation. By comparison, our proposed method outperforms the methods that use a regional activity measure with a multistrategy fusion rule (GFFC and RF_SSIM). In addition, the fixed and the adaptive shrink factors are compared: as can be seen from the last three rows of Table 1, since an adaptive $a$ makes the fusion strategy flexible, our proposed method with an adaptive $a$ achieves higher performance than the variants with a fixed $a$. Overall, although the result of the proposed method is slightly inferior to that of PCA in terms of MI and to that of RSSF in terms of RCE, it is superior to the other methods in the remaining terms. This means that the image fused by our method contains more details and a greater amount of information.

7.1.2. Fusion results of medical images

Multimodal medical image fusion makes it easier for physicians to understand a lesion by combining images of different modalities. For example, fused magnetic resonance imaging/computed tomography (MRI/CT) imaging can concurrently visualize anatomical and physiological characteristics of the human body for diagnosis and treatment planning.26 In this section, the fused results of MRI/CT are shown in Fig. 10. The MRI [Fig. 10(a)] and CT [Fig. 10(b)] images are obviously complementary, but the MRI contains more information. By subjective evaluation, the result of the AVE-based method, which averages the source images [Fig. 10(c)], has low contrast. Since most of the information resides in the MRI, the weights derived from the eigenvalues of the covariance of the source images are biased toward the MRI, and the fused image of the PCA-based method [Fig. 10(d)] therefore loses the information of the CT. Owing to its single choose-max strategy and region-based activity measure, RSSF is inclined to choose inappropriate regions, as shown in Fig. 10(h). Due to its region-based activity measure, the fusion result of RF_SSIM (a multistrategy fusion method) lacks contrast. In contrast, because our method adopts an adaptive multistrategy fusion rule based on local structural characteristics and a coefficient-based activity measure, our result achieves better visual perception. The objective evaluations are listed in Table 2. The PCA method wins in terms of MI and IQM because it is biased toward the MRI and the amount of information in the MRI is far greater than that in the CT. The low-contrast fused images of AVE, GFFC, and RF_SSIM yield worse performance in terms of EI and Qabf, among others, which is in line with the visual perception. Our proposed method achieves the best values for the other four of the seven indices, including EI, EN, and Qabf.

Table 2 Objective comparison of fused results with different methods for medical images.
Note: The bold values represent the best results.

7.1.3. Fusion results of infrared-visual images

Usually, visible images provide spatial details of the background but cannot reveal the objects, whereas infrared images capture the objects but fail to reveal some background areas. A fused image displays both the objects and the spatial details of the background, which is a potential solution for improving target detection. In Fig. 11, we show an example of visual-infrared image fusion. The luminance of the object (the person) in Figs. 11(c)–11(f) and 11(j) decreases compared with the infrared image in Fig. 11(a). In Fig. 11(d), the object even changes from white to black, which is disadvantageous for subsequent object processing. Although Fig. 11(i) preserves the luminance of the object and obtains the best values in terms of MI and Qabf, there are some apparent image stitches that seriously affect the visual perception. Obviously, our fused image has better visual perception than the others. Moreover, our method also outperforms the others in terms of EI and EN, as given in Table 3. For MI, our method obtains the largest value except for those of PCA and RSSF. Furthermore, the objective evaluation confirms that the proposed method with an adaptive $a$ is superior to the variants with a fixed $a$ in the fusion of infrared-visual images.

Table 3 Objective comparison of fused results with different methods for infrared and visual images.
Note: The bold values represent the best results.

It should also be noted that the improved performance of the proposed method comes at the cost of increased computational complexity. As shown in the last column of Table 3, the proposed method consumes more time than the other fusion methods. The increased time mainly results from the computation of the region map and the activity measure, which may limit the method's application in some real-time cases.

7.1.4. Fusion results of other images

To further evaluate the robustness of the proposed method, a series of fusion experiments is performed on five additional pairs of source images. Considering the limitation of space, Fig. 12 only shows the source images, the region maps, and the fused results of the proposed method, and Table 4 only lists the results obtained by GFFC, RSSF, RF_SSIM, and the proposed method with fixed and adaptive shrink factors. Since the evaluation indices MI, RCE, and Qabf depend on the consistency of the gray distribution or pixel values between the fused image and the source images, RSSF, which simply copies source regions, wins in terms of these indices. However, the fused results of RSSF often contain apparent image stitches, which seriously affect the visual perception. For the other four metrics, the proposed method provides the best objective results, which illustrates that more details from the source images are retained. This verifies that the proposed adaptive multistrategy fusion rule is more beneficial than the single fusion strategy (RSSF), the multistrategy fusion rules based on a regional activity measure (GFFC and RF_SSIM), and our method with a fixed $a$.

Table 4 Objective comparison of fused results with different methods for the other five kinds of images.
Note: The bold values represent the best results.

7.2. Experiment 2: Advantages of SIST Over SWT and NSCT

To verify the superiority of SIST, the proposed method is evaluated in different transform domains (SWT and NSCT) on the multifocus clock images [Figs. 4(a) and 4(b)] and the MRI/CT images [Figs. 10(a) and 10(b)]. In our experiments, the images are all decomposed into four levels by SWT (with the basis function sym4), NSCT, and SIST. The decomposition levels of NSCT are set as [2, 2, 3, 3], with the pyramid filter "9-7" (Gaussian or Laplacian pyramid decomposition filter) and the directional filter "pkva" (directional filter bank decomposition filter). Figures 8(k) and 10(n) show the fusion results of the proposed method using SIST for Figs. 4(a), 4(b), 10(a), and 10(b); Fig. 13 shows the corresponding results using SWT and NSCT. Tables 5 and 6 provide an objective comparison among the different transform domains. The results show that, in general, the proposed fusion method performs best in the SIST domain. This is mainly because the SIST does not restrict the number of directions of the shearing and its computation is more efficient than that of NSCT and SWT.

Table 5 The advantages of using SIST versus SWT and NSCT [Figs. 4(a) and 4(b)].
Note: The bold values represent the best results.

Table 6 The advantages of using SIST versus SWT and NSCT [Figs. 10(a) and 10(b)].
Note: The bold values represent the best results.

7.3. Experiment 3: Effects of Parameters on the Proposed Method

7.3.1. Effect of the number of clusters

In this experiment, the effect of the number of clusters on the performance of the proposed method is investigated. Due to space limitations, only Figs. 11(a), 11(b), 12(a2), and 12(b2) are given as examples. They represent two typical kinds of source images: Figs. 11(a) and 11(b) contain different clarity and different information, whereas Figs. 12(a2) and 12(b2) contain the same information with different clarity. With the window size fixed at 4×4 and the sliding window size at 5×5, the number of clusters varies from 2 to 50, and the corresponding EN, RCE, and Qabf metrics are plotted in Figs. 14(a), 14(b), and 14(c), respectively. As the number of clusters increases, the EN and RCE values of the two fused images and the Qabf values of Figs. 12(a2) and 12(b2) vary only by a small amount. When the number of clusters is 10, the Qabf values of Figs. 11(a) and 11(b) achieve their largest values. However, the time consumption increases with the number of clusters. To balance the time consumption and the quality of image fusion, the number of clusters is set to 10 in our proposed method.

7.3.2. Effect of window size

Let the window size vary from 4 to 30 with the number of clusters and the sliding window size fixed. The corresponding values of EN, RCE, and Qabf are plotted in Figs. 15(a), 15(b), and 15(c), respectively. For Figs. 12(a2) and 12(b2), the values of EN, RCE, and Qabf are stable as the window size increases. For Figs. 10(a) and 10(b), the EN value (from 6 to 30) and the Qabf value (from 4 to 30) vary only slightly, whereas the RCE value is volatile. From the statistical analysis of Figs. 11(a) and 11(b), the EN metric has its largest value and the RCE metric shows better fusion properties when the window size of Figs. 10(a) and 10(b) is 4. Thus, a window size of 4×4 is a reasonable selection.

7.3.3. Effect of sliding window size

This experiment explores the effect of the sliding window size on the performance of the proposed method. With the window size fixed at 4×4 and the number of clusters at 10, the sliding window size varies from 3 to 23 in Fig. 16. As can be seen from Fig. 16, the variation of the sliding window size has little effect on EN, RCE, and Qabf for Figs. 12(a2) and 12(b2). For Figs. 11(a) and 11(b), the metrics fluctuate slightly as the sliding window size increases, and the best performance is achieved with a sliding window size of 5. Therefore, the sliding window size in our proposed method is set to 5.

8. Conclusion

In this paper, an adaptive multistrategy image fusion method has been proposed. A multiscale image decomposition tool and a multistrategy fusion rule are its two key components. The SIST is adopted as the multiscale analysis tool, and the source images are decomposed into low-frequency and high-frequency subbands. The choose-max strategy is employed to fuse the low-frequency subbands, which contain the approximate information of the source images. An adaptive multistrategy fusion rule with a sigmoid function has been proposed to distinguish the attributes of the high-frequency subbands and fuse them automatically. The dissimilarity of corresponding regions and the activity measure of the high-frequency coefficients are employed to identify the attributes of the high-frequency coefficients.
These are used as the variables of the sigmoid function; they determine the steepness of the sigmoid curve, and the different steepnesses correspond to the different fusion strategies. By using the sigmoid function, adaptive selection of the fusion strategy is achieved. Several sets of experimental results demonstrate the validity, flexibility, and generality of the proposed method in terms of both visual quality and objective evaluation. It should be noted that although the proposed method achieves better results, its computational complexity is somewhat high because several techniques are incorporated into it. Thus, optimizing the time consumption will be one direction of our future work; another is to investigate more effective functions for adaptive fusion strategy selection.

Acknowledgments

This work was supported by the National Natural Science Foundation of China under Grant Nos. 61103128 and 61373055, the China Postdoctoral Science Foundation under Grant No. 2013M541601, and the Postdoctoral Research Funds of Jiangsu Province under Grant No. 1301079C. The authors would like to thank the anonymous reviewers for their detailed reviews, valuable comments, and constructive suggestions.

References
1. H. Yesou, Y. Besnus, and J. Rolet, "Extraction of spectral information from Landsat TM data and merger with SPOT panchromatic imagery - a contribution to the study of geological structures," ISPRS J. Photogramm. Remote Sens. 48(5), 23–26 (1993). http://dx.doi.org/10.1016/0924-2716(93)90069-Y

2. H. Li, S. Manjunath, and S. Mitra, "Multisensor image fusion using the wavelet transform," Graph. Models Image Process. 57(3), 235–245 (1995). http://dx.doi.org/10.1006/gmip.1995.1022

3. S. Zheng et al., "Multisource image fusion method using support value transform," IEEE Trans. Image Process. 16(7), 1831–1840 (2007). http://dx.doi.org/10.1109/TIP.2007.896687

4. J. Tian and L. Chen, "Adaptive multi-focus image fusion using a wavelet-based statistical sharpness measure," Signal Process. 92(9), 2137–2146 (2012). http://dx.doi.org/10.1016/j.sigpro.2012.01.027

5. H. Li et al., "Multifocus image fusion and denoising scheme based on homogeneity similarity," Opt. Commun. 285(2), 91–100 (2012). http://dx.doi.org/10.1016/j.optcom.2011.08.078

6. C. Shi, Q. G. Miao, and P. F. Xu, "A novel algorithm of remote sensing image fusion based on shearlets and PCNN," Neurocomputing 117, 47–53 (2013). http://dx.doi.org/10.1016/j.neucom.2012.10.025

7. G. S. El-taweel and A. K. Helmy, "Image fusion scheme based on modified dual pulse coupled neural network," IET Image Process. 7(5), 407–414 (2013). http://dx.doi.org/10.1049/iet-ipr.2013.0045

8. G. Piella, "A general framework for multiresolution image fusion: from pixels to regions," Inf. Fusion 4(4), 259–280 (2003). http://dx.doi.org/10.1016/S1566-2535(03)00046-0

9. Z. Wang et al., "Image quality assessment: from error visibility to structural similarity," IEEE Trans. Image Process. 13(4), 600–612 (2004). http://dx.doi.org/10.1109/TIP.2003.819861

10. T. Chen, J. P. Zhang, and Y. Zhang, "Remote sensing image fusion based on ridgelet transform," in Proc. Int. Geoscience and Remote Sensing Symp., 1150–1153 (2005).

11. S. Li and B. Yang, "Multifocus image fusion using region segmentation and spatial frequency," Image Vision Comput. 26(7), 971–979 (2008). http://dx.doi.org/10.1016/j.imavis.2007.10.012

12. X. J. Wu, D. X. Su, and X. Q. Luo, "A new similarity function for region based image fusion incorporating Gabor filters and fuzzy c-means clustering," Proc. SPIE 6625, 66250Z (2007). http://dx.doi.org/10.1117/12.791022

13. Q. Zhang et al., "Similarity-based multimodality image fusion with shiftable complex directional pyramid," Pattern Recogn. Lett. 32(13), 1544–1553 (2011). http://dx.doi.org/10.1016/j.patrec.2011.06.002

14. X. Y. Luo, J. Zhang, and Q. H. Dai, "A region image fusion based on similarity characteristics," Signal Process. 92(5), 1268–1280 (2012). http://dx.doi.org/10.1016/j.sigpro.2011.11.021

15. X. Luo, Z. Zhang, and X. Wu, "Image fusion using region segmentation and sigmoid function," in Proc. 22nd Int. Conf. on Pattern Recognition (ICPR), 1049–1054 (2014).

16. L. Wang, B. Li, and L. Tian, "Multi-modal medical image fusion using the inter-scale and intra-scale dependencies between image shift-invariant shearlet coefficients," Inf. Fusion 19(9), 20–28 (2014). http://dx.doi.org/10.1016/j.inffus.2012.03.002

17. G. Easley, D. Labate, and W. Q. Lim, "Sparse directional image representations using the discrete shearlet transform," Appl. Comput. Harmon. Anal. 25(1), 25–46 (2008). http://dx.doi.org/10.1016/j.acha.2007.09.003

18. J. L. Liang et al., "Image fusion using higher order singular value decomposition," IEEE Trans. Image Process. 21(5), 2898–2909 (2012). http://dx.doi.org/10.1109/TIP.2012.2183140

19. J. Yang et al., "Two-dimensional PCA: a new approach to appearance-based face representation and recognition," IEEE Trans. Pattern Anal. Mach. Intell. 26(1), 131–137 (2004). http://dx.doi.org/10.1109/TPAMI.2004.1261097

20. R. P. Nikhil, P. Kuhu, and M. K. James, "A possibilistic fuzzy c-means clustering algorithm," IEEE Trans. Fuzzy Syst. 13(4), 517–530 (2005). http://dx.doi.org/10.1109/TFUZZ.2004.840099

21. J. Saeedi and K. Faez, "Fisher classifier and fuzzy logic based multi-focus image fusion," in Proc. IEEE Int. Conf. on Intelligent Computing and Intelligent Systems, 420–425 (2009).

22. G. Qu, D. Zhang, and P. Yan, "Information measure for performance of image fusion," Electron. Lett. 38(7), 313–315 (2002). http://dx.doi.org/10.1049/el:20020212

23. C. S. Xydeas and V. Petrovic, "Objective image fusion performance measure," Electron. Lett. 36(4), 308–309 (2000). http://dx.doi.org/10.1049/el:20000267

24. Z. Wang and A. Bovik, "A universal image quality index," IEEE Signal Process. Lett. 9(3), 81–84 (2002). http://dx.doi.org/10.1109/97.995823

25. X.-Q. Luo and X.-J. Wu, "A new metric of image fusion based on region similarity," Opt. Eng. 49(4), 047006 (2010). http://dx.doi.org/10.1117/1.3394086

26. A. Polo, F. Cattani, and A. Vavassori, "MR and CT image fusion for post implant analysis in permanent prostate seed implants," Int. J. Radiat. Oncol. Biol. Phys. 60(5), 1572–1579 (2004). http://dx.doi.org/10.1016/j.ijrobp.2004.08.033
Biography

Xiao-qing Luo received a PhD degree in pattern recognition and intelligent systems from Jiangnan University, Wuxi, China, in 2010. She currently teaches as an associate professor in the School of Internet of Things at Jiangnan University. Her current research interests are image fusion, pattern recognition, and other problems in image technologies. She has published more than 40 technical articles in these areas.

Zhan-cheng Zhang received a PhD degree from the School of Information Technology, Jiangnan University, in 2011. From 2012 to 2013, he was a postdoctoral fellow with the Chinese Academy of Sciences. Since 2014, he has been an assistant professor with the College of Electronic and Information Engineering, Suzhou University of Science and Technology. He is the author of 30 articles and holds five patents. His research interests include pattern recognition and image fusion.

Xiao-jun Wu received a PhD degree in pattern recognition and intelligent systems from Nanjing University of Science and Technology, Nanjing, China, in 2002. He joined the School of Information Engineering (now the School of IoT Engineering), Jiangnan University, in 2006, where he is a professor. His current research interests are pattern recognition, computer vision, fuzzy systems, neural networks, and intelligent systems. He has published more than 150 papers in his fields of research.