Скачать 95.46 Kb.

An Image Change Detection Algorithm Based on Markov Random Field Models T. Kasetkasem and P.K. VarshneyDepartment of Electrical Engineering and Computer Science Syracuse University Syracuse, NY 13244Email: tkasetka@syr.edu and varshney@syr.edu Abstract – This paper addresses the problem of image change detection based on Markov random field (MRF) models. MRF has long been recognized as an accurate model to describe a variety of image characteristics. Here, we use the MRF to model both noiseless images obtained from the actual scene and change images whose sites indicate changes between a pair of observed images. The optimum image change detection algorithm under the maximum a posteriori (MAP) criterion is developed under this model. Examples are presented for illustration and performance evaluation.
The ability to detect changes that quantify temporal effects using multitemporal imagery provides a fundamental image analysis tool in many diverse applications. Due to the large amount of available data and extensive computational requirements, there is a need for change detection algorithms that automatically compare two images taken from the same area at different times, and determine the locations of changes. Usually, in the comparison process [14], differences between two corresponding pixels belonging to the same location for an image pair are determined based on some quantitative measure. Then, a change is labeled, if this difference measure exceeds a predefined threshold, and no change is labeled, otherwise. Most of the comparison techniques described in [1] only consider information contained within a pixel even though intensity levels of neighboring pixels of images are known to have significant correlation. Also, changes are more likely to occur in connected regions rather than at disjoint points. By using these facts, a more accurate change detection algorithm can be developed. To accomplish this, a Markov random field (MRF) model for images is employed in this paper so that statistical correlation of intensity levels among neighboring pixels can be exploited. MRF has long been recognized as an accurate model to describe a variety of image characteristics such as texture. Under this model [58], the configuration (intensity level) of a site (pixel) is assumed to be statistically independent of configurations of all remaining sites excluding itself and its neighboring sites when configurations of its neighboring sites are given. In other words, the configuration of a pixel given the configurations of the rest of the image is the same as the configuration of a pixel given the configurations of its neighboring pixels. Furthermore, the MRF is known to be equivalent to the Gibbs field [6] whose probability density function (PDF) is given by (1) where S is a set of sites contained within an image, x is a vector of configurations (intensity levels) over S, is a normalizing constant, and V_{C} is a Gibbs potential function. A Gibbs potential function is a function of configurations and cliques. A clique [56], denoted by C, is defined as a set of sites whose elements are mutual neighbors. Studies reported in [910] have tried to employ the MRF model for image change detection (ICD). In [9], one image is subtracted from the other, pixel by pixel, and two thresholds (one low and one high) are selected. If the difference intensity level of a pixel is lower than the low threshold, then this pixel is put in the absolute unchanged class. If the intensity level is greater than the high threshold, the corresponding pixel is put in the absolute changed class. The remaining pixels whose difference intensity levels are between these two thresholds are subjected to further processing where the spatialcontextual information based on the MRF model is considered. A similar approach can be found in [10]. Again, this algorithm can be divided into two parts. In the first part, a pixelbased algorithm [1] determines an initial change image that is further refined based on the MRF model in the second part. Some information is lost while obtaining the initial change image since the observed data is projected into a binary image whose intensity levels represent change or no change. We observe that studies in [910] do not fully utilize all the information contained in images, and moreover, the preservation of MRF properties is not guaranteed. In [11], the effect of image transformations on images that can be modeled by MRFs is studied. It has been shown that MRF properties may not hold after many transformations such as resizing of an image and subtraction of one image from another. For some specific transformations, MRF properties are preserved, but a new set of potential functions must be obtained. Since a difference image can be looked upon as a transformation, MRF modeling of a difference image in [9] and initial change image in [10] may not be valid. This provides the motivation for the development of an ICD algorithm that uses additional information available from the image and preserves MRF properties. Here, we develop an ICD algorithm that consists of only one part. The observed images modeled as MRFs are directly processed by the MAP detector which searches for the global optimum. The detector based on the MAP criterion chooses the most likely change image among all possible change images given the observed images. The resulting probability of error is minimum among all other detectors [1213]. The structure of the MAP detector is based on statistical models. Therefore, the accuracy of the statistical models for the given images as well as for the change image is crucial. In this paper, we assume that the given images are obtained by the summation of noiseless images (NIM) and image noises. Both the NIMs and the change image are assumed to have MRF properties. Furthermore, we assume that configurations of the NIMs are equal for unchanged sites. In addition, the configurations of changed sites from one NIM are independent of configurations of changed sites from the other NIM when the configurations of unchanged sites are given. Then, the a posteriori probability is determined based on the above assumptions, and the MAP criterion is used to select the optimum change image. Due to the complexity of the a posteriori probability computation, the solution of the MAP detection problem cannot be obtained directly. As a result, a stochastic search method such as the simulated annealing (SA) algorithm [7] is employed. Here, the SA algorithm generates a random sequence of change images in which a new configuration depends only on the previous change image and observed images by using the Gibbs sampling procedure [58]. The randomness of a new change image gradually decreases as the number of iterations increases. Eventually, this sequence of change images converges to the solution of the MAP detector. This paper is organized as follows. In the next section, the problem is formulated and the above assumptions are restated in a more formal manner. The optimum image change detection algorithm under the MAP criterion is developed in Section 3. Section 4 provides some examples for performance evaluation and illustration. A summary is presented in Section 5. II. PROBLEM STATEMENT Let S be a set of sites s, and be the phase space. Furthermore, we write as a sequence of image (configuration) vectors taken at time ; where . Note that, in the problem of interest here, N = 2. We assume that X_{i}(S) satisfies MRF properties with a Gibbs potential V_{C}(x_{i}). Note that X_{i}(S) are noiseless image models (NIM) of the scene. These will be used extensively in our discussion. Due to noise, we cannot observe X_{i} directly. Instead, we observe noisy images (NI) which are given by (2) where is the phase space corresponding to noisy observed images and W_{i}(S) is a vector of additive Gaussian noises with mean zero and covariance matrix . Here, I is an identity matrix of size , where . Obviously, there are a total of possible change images (CIs) that may occur between any pair of NIMs (including the nochange event). Let , correspond to the k^{th} CI which is a binary image whose intensity levels are either 0 or 1. Here, we use the notation to indicate a change at site a in the k^{th} CI. When , we have to indicate no change. We assume that all CIs satisfy MRF properties with the Gibbs potential U_{C}(H_{k}). Then, we can write the marginal PDFs for X_{i} and H_{k} as , (3) and , (4) where and , respectively. Since the elements of W_{i}(S) are i.i.d. additive Gaussian noises, the conditional probability density of Y_{i}(S) given X_{i}(S) is given by (5) Given H_{k}, we can partition a set S into two subsets and such that and , where andcontain all changed and unchanged sites, respectively. We further assume that the configurations of changed sites between a pair of NIMs are independent given the configurations of unchanged sites, i.e., (6) where and are the configurations of changed sites and unchanged sites of X_{i}, respectively. Here, we have made the simplifying assumption that the pixel intensities of the changed sites in a pair of images are statistically independent given the intensities of the unchanged sites to keep the analysis tractable. A justification for this assumption is that changes are not predictable based on the current knowledge. Note that for unchanged sites, configurations of the same site in a pair of images must be equal. This equal configuration value is denoted by . For notational convenience, let us denote and by and , respectively. The joint probability density function (PDF) of the pair is given by (7) Since X_{i} and X_{j} correspond to the same scene, we assume that they are statistically identical. By summing over all possible configurations of changed sites, the joint PDF can be expressed as (8) Substituting the Gibbs potential V_{C} and using the properties of conditional probability, (8) can be written as (9) By using the notation to denote the sum over the sets C that contain site s, the term can be decomposed as (10) where the boundaries of the changed region are denoted by contained within the changed region. Substituting (10) into (9), we obtain , (11) where , (12) and . (13) are the normalizing constant and the joint image energy function associated with H_{k}, respectively. Based on these assumptions, we design the optimum detector in the next section. Here, we have considered the case of discretevalued MRFs, the derivation in the case of continuousvalued fields (used in examples) is analogous. III. OPTIMUM CHANGE DETECTION ALGORITHM We formulate the change detection problem as an Mary hypothesis testing problem where each hypothesis corresponds to a different change image or a no change image. The maximum a posteriori (MAP) criterion [1213] is used for detecting changed sites in our work. This criterion is expressed as . (14) From Bayes’ rule, (14) can be rewritten as . Since is independent of H_{k} and the two noises are independent of each other, the above equation reduces to (15) where and denote , and , respectively. Substituting (4), (5) and (11) into (15) and taking a constant term out, we obtain , (16) where , (17) and . (18) From (16), we can observe that maximization of (15) is equivalent to minimization of . Therefore, our optimal ICD algorithm can be expressed as . (19) In practice, the solution of (19) cannot be obtained directly due to the large number of possibilities of change images (e.g., 16 for a 22 image and 2^{4096} for a 6464 image). Moreover, gradient search methods are not feasible because of the extreme nonconvexity of the a posteriori probability in general. To cope with these problems, we employ the simulated annealing (SA) [57] algorithm which is a part of Monte Carlo procedures to search for solutions of (19). The SA algorithm generates a sequence of configurations (or CIs) that eventually converges to solutions of (19). This algorithm can be described in terms of the following steps:
2) At each step, obtain a new configuration (image) from the previous configuration (image) based on a Gibbs sampling procedure [5]. 3) Reduce the temperature with a predetermined schedule and go to Step 2 until convergence. Here, the term temperature is used to control the randomness of the configuration generator. High temperature indicates high randomness and low temperature indicates less randomness. Eventually, the resulting change image will converge to one of the possible solutions of the MAP detection problem given in (19) with equal probability, as randomness is slowly removed. However, if randomness is removed too quickly the resulting change image may get stuck in one of the local maxima of the a posteriori probability. To guarantee convergence, Geman and Geman [7] suggested that the temperature must decrease at a rate slower than where M is the number of sites and is the absolute maximum change in Gibbs energy when the configuration of a single change site is changed. In the next section, some examples are provided for illustration. Both a simulated dataset and an actual dataset are used. The pseudocode for the simulated annealing algorithm is provided in the Appendix. IV. EXPERIMENTAL RESULTS In this section, we consider the specific problem of image change detection based on the MAP detector given in (19). Consider Figure 1 where we show the 8neighborhood of an image pixel and ten possible cliques. In the illustrative examples considered here, we consider only five clique types, C_{1}, C_{2, }C_{3}, C_{4 },C_{5} associated with the singleton, vertical pairs, horizontal pairs, leftdiagonal pairs, and rightdiagonal pairs, respectively. Furthermore, we assume that image potential functions are translation invariant. The Gibbs energy function of images is assumed to be given by , (20) where is the NIM parameter vector, and J is assumed to be which is the NIM potential vector associated with clique types, C_{1},…,C_{5}, respectively. We observe that the NIM potential vector is in a quadratic form. The quadratic assumption is widely used to model images in numerous problems [12], and is called the Gaussian MRF (GMRF) model. GMRF models are suitable for describing smooth images with a large number of intensity levels. Furthermore, under this Gaussian MRF model, Equation (19) can be solved more easily due to the fact that the summation over configuration space in (17) changes to infinite integration in product space. Hence, the analytical solution for (19) can be derived. Using the GMRF model [1415], we can rewrite (20) as , (21) where the element of the MM matrix is given by . (22) Similarly, we define a Gibbs energy function of a CI over the clique system as , (23) where is the CI parameter vector, and (24) is a CI potential vector associated with clique types as mentioned above and I(a,b) = 1 if a = b, and I(a,b) = 1, otherwise. The optimum detector has been derived in [18]. As mentioned in the previous section, the direct search for the solution of (19) is not feasible. Therefore, the simulated annealing algorithm is used to find the solution. Here, the Gibbs energy function is calculated based on the observed images and the old/new change image. In order to obtain the result in a reasonable time, NIMs can be divided into subimages of much smaller size (77 in our examples) due to the intensive computation required for the inversion of large matrices (4096 by 4096 for a 64 64 image). The suboptimum approach of considering one subimage at a time sacrifices optimality for computational efficiency. However, the statistical correlation among sites is concentrated in regions that are only few sites apart. Therefore, our approximation is reasonable and yields satisfactory results in the examples. Our optimal algorithm involves matrix inversion and multiplication, both of which have the computational complexity for an image of size n n. During each iteration, L_{k} given in (24) is computed at least 2n^{2} times. As a result, the total complexity of our optimal algorithm is . However, when we employ the suboptimum algorithm, the number of operations for each subimage is fixed. Consequently, the overall complexity reduces to . Example 1: Two noiseless simulated images of size 128 128 to be considered are displayed in Figures 2 (a)(b), and their corresponding CI is shown in Figure 3 where dark and bright regions denote change and no change, respectively. We observe that changes occur in the square region from pixel (60,60) to (120,120). To test our proposed ICD algorithm, these simulated images are disturbed with an additive Gaussian noise with zero mean and unit variance. An SA algorithm, with initial temperature T_{0} = 2, is employed for optimization. For the difference image, the average image intensity power to noise power is 3.4 dB. For both noiseless images, the average signal power is about 3.9 dB. Next, our proposed ICD algorithm is employed, and the results are shown in Figures 4 (a)(d) after 0, 28, 70 and 140 sweeps, respectively. At 0sweep, an image differencing technique is used, and the resulting CI is extremely poor. The situation improves as more sweeps are done. Significant improvement can be seen when we compare Figures 4 (a) and (b), and Figures 4 (b) and (c). However, very little improvement is seen in Figure 4 (d) when we compare it with Figure 4 (c). In order for the performance to be quantified, we introduce three performance measures, average detection, false alarm and error rates. The average detection rate is defined as the total number of changed sites that are detected divided by the total number of changed sites while the average false alarm rate is defined as the total number of unchanged sites that are declared changed sites divided by the total number of unchanged sites. Similarly, the average error rate is given by the total number of misclassified sites (changed sites are declared unchanged and vice versa.) divided by the total number of sites in the image. We plot the average detection, false alarm and error rates in Figure 5 for a quantitative comparison. We observe rapid improvement from 0 to 60 sweeps, and roughly constant performance after around 80 sweeps. The detection rate increases from about 60 percent to more than 90 percent and the average error rate decreases from around 35 percent to less than 5 percent. For further performance evaluation, we apply the Bruzzone and Prieto algorithm (BPA) in [9] to the same image dataset. In their algorithm, an initial change image is determined by applying two thresholds (low and high) to the difference image and they do not use all available information since only the different image is used. A pixel is labeled as change if difference in intensity levels between two corresponding pixels from different observed images exceeds the high threshold and as no change if it falls below the low threshold. The high and low thresholds are around the middle value, and the ratio of distance from the low and high thresholds to middle value is denoted by . For faster computation, instead of dealing with CI parameters in a vector format as we did, the authors used a simple scalar parameter denoted by . In this example, we choose = 0.75 (based on some experimentation) and = 1 and the resulting change image after five hundred iterations is shown in Figure 6 (a) while Figure 6 (b) displays the change image from our algorithm for comparison purposes. Clearly, our image change detection algorithm significantly outperforms the BPA. This is because their algorithm depends largely on an initial change image. In this particular case, the SNRs of the observed images and the difference image are very low causing a large number of misclassified pixels in the initial change image that are spread over both changed and unchanged regions which severely affects the performance of their algorithm. Our algorithm, on the other hand, depends mainly on the observed images rather than on an initial change image which makes it more robust in a low SNR environment. Robustness of our algorithm will be discussed shortly. The number of floating point operations for each iteration associated with the BPA in [9] is 5.1 10^{5} when using Matlab while for our algorithm, it is 3.1 10^{10}. Furthermore, BPA has the complexity O(n^{2}) because the number of instructions per iteration is independent of the image size. At this point our algorithm consumes more computational resources to achieve performance gain. Our algorithm is more suitable for offline applications than realtime ones. Furthermore, we investigate the robustness of our ICD algorithm with respect to image noise, image misregistration, and modeling errors. To consider the effect of image noise, four different experiments are carried out. Four additive Gaussian noise vectors with different values of variances, 0.01, 0.1, 1, and 10, are added to images shown in Figures 2 (a) and (b), and are submitted to our ICD algorithm. The resulting change images are shown in Figures 7 (a)(d), respectively. The error rates for each value of noise variance are calculated and plotted in Figure 8. As expected, higher value of noise variance yields higher error rate. In addition, we notice that for a large value of noise variance, our algorithm declares the entire image as no change to minimize the probability of error because noise overwhelms the information in NIMs. In other words, the dependence of intensity levels between two highly noisy images becomes insignificant. Next, we simulate the effect of misregistration on our ICD algorithm. In [10,17], the effect of misregistration has been previously investigated. Here, we study its effect on the performance of our algorithm. Here, we fix one image, Figure 2(a) and move the other image, Figure 2(b), to the right (x direction) by 0, 0.25, 0.5 and 0.75 pixels and determine the corresponding CIs for each value of misregistration. Since we translate the image by a noninteger number of pixels, interpolation is required. A linear interpolation method is used to estimate the intensity of each pixel. The results are shown in Figures 9 (a)(d) for noise variance equal to unity, and error rate as a function of misregistration is plotted in Figure 10. The effect of misregistration is rather small for misregistration under 0.5 pixel. However, larger misregistration affects the performance significantly. Lastly, we examine the effect of modeling errors on our algorithm. We simulate modeling errors by changing the PDF of noises from Gaussian to others. Here, we corrupt two NIMs with three additive noises with identical power but different PDFs which are Gaussian, speckle, and uniform, respectively. The multiplicative noise is generated using MATLAB as follows I = I + nI, (25) where I is the noiseless image data, and n is a uniform random variable. The resulting change images are shown in Figures 11 (a)(c), and the corresponding error rates are 0.061, 0.142 and 0.077, for Gaussian, speckle and uniform noises, respectively. As expected, the best performance is achieved in the Gaussian case where there is no modeling error. Example 2 In this example, we apply our ICD algorithm to two actual remotesensing images of San Francisco Bay taken on April 6^{th}, 1983 and April 11^{th} 1988 shown in Figures 12 (a)(b), respectively. These images represent the false color composites made from MSS band 4, 5 and 6, and are given at http://sfbay.wr.usgs.gov/access/change_detect/ Satellite_Images 1.html. For simplicity, our ICD algorithm will determine change sites only in images of the same bands. In other words, we will separately find CIs for red, green, and blue color spectra, respectively. Since we do not have any prior knowledge about the Gibbs potential and the Gibbs parameters associated with the Gibbs potential for both NIMs and CI, assumptions must be made and a parameter estimation method must be employed. Since the intensity levels of NIMs can be anywhere in [0 255], we assume that the intensity level of NIMs follows Gaussian MRF as in [1415], i.e., the Gibbs potential is in a quadratic form. Furthermore, we choose the maximum pseudo likelihood estimation (MPLE) method described in [16] to estimate the Gibbs parameters. Here, we estimate unknown parameters after every ten complete updates of CI. The results of changed sites for individual spectra are displayed in Figures 13 (a)(b), (c)(d), and (e)(f), respectively. Figures 13 (a), (c) and (e) are determined by our proposed ICD algorithm while (b), (d) and (f) result from the image differencing algorithm described in [1]. By carefully comparing results within each color spectrum, we conclude that our ICD algorithm gives results which are more connected. Further results on performance evaluation with this real dataset and fusion of change images based on the decision fusion methodology described in [12] are available in [18] V. SUMMARY This paper investigated the issue of image change detection based on Markov random field (MRF) models. These models characterize the statistical correlation of intensity levels among neighboring pixels more accurately than pixelbased models We have developed a new image change detection algorithm based on an MRF model that employs the MAP criterion. The algorithm involves the search for an optimum for which the simulated annealing algorithm is used. By means of two examples, we have shown the superior performance of our algorithm. This is due to the use of contextual information as well as the computation of the true MAP solution. The effect of uncertainties on the performance of our algorithm was also investigated. Noise, misregistration and modeling errors were considered to be the sources of uncertainty. Our algorithm was found to be quite robust to various types of uncertainties. We realize that the independence assumption of the configurations of changed sites given the configurations of unchanged sites may not be suitable for all cases. A more accurate model may result in better performance, and may be considered in the future. 