ebook img

Adaptive Langevin Sampler for Separation of t-Distribution Modelled Astrophysical Maps PDF

1.3 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Adaptive Langevin Sampler for Separation of t-Distribution Modelled Astrophysical Maps

IEEETRANSACTIONONIMAGEPROCESSING,VOL.?,NO.?,???????201? 1 Adaptive Langevin Sampler for Separation of t-distribution Modelled Astrophysical Maps Koray Kayabol, Member, IEEE, Ercan E. Kuruog˘lu, Senior Member, IEEE, Jose´ Luis Sanz, Bu¨lent Sankur, Senior Member, IEEE, Emanuele Salerno and Diego Herranz 1 Abstract—We propose to model the image differentials of The first examples of use of t-distribution in inverse imaging 1 astrophysical source maps by Student’s t-distribution and to problems can be found in [5] and [6]. In [6], it is shown that 0 use them in the Bayesian source separation method as priors. the t-distribution approximates the distribution of the wavelet 2 We introducean efficient Markov Chain Monte Carlo (MCMC) n samplingschemetounmixtheastrophysicalsourcesanddescribe coefficients of an image more accurately. In recent papers, a the derivation details. In this scheme, we use the Langevin it has been used for image restoration [7] and deconvolution J stochastic equation for transitions, which enables parallel draw- [8]. Notice that the degree of freedom parameter of the t- ing of random samples from the posterior, and reduces the distribution has the same role as the regularization parameter 7 computation time significantly (by two orders of magnitude). of the Markov Random Field (MRF) models. The MRF prior In addition, Student’s t-distribution parameters are updated ] with Cauchy density, which was first proposed in [10] for M throughout the iterations. The results on astrophysical source separation areassessed withtwoperformancecriteriadefinedin inverse imaging, was used in [1] for source separation. The I the pixel and the frequency domains. model used in [1] is an approximation to the t-distribution, . h which is presented in Section V-A. Index Terms—Bayesian source separation, Multi-channel p denoising, Metropolis-Hastings, Langevin stochastic equation, The t-distribution has already been used in Bayesian audio - o MCMC, Astrophysical images, Student’s t-distribution. source separation [9] to model the discrete cosine transform r coefficients of the audio signals. It was reported that the t s t-distribution prior had improved the sound quality over a I. INTRODUCTION the finite mixture-of-Gaussians prior. In this study, to solve [ THE Bayesian framework,which enables the inclusion of the Bayesian BSS problem for images without incurring in 1 prior knowledge in problem formulation, has recently smoothingartifacts,weproposethet-distributionformodeling v beenutilizedtoimprovetheperformanceofBlindSourceSep- the local pixel differences. 6 aration (BSS) techniques. In the context of image separation, We use the joint posterior density of the complete vari- 9 3 one obvious type of prior information is the spatial [1] or able set to obtain a joint estimate of all the variables. In 1 spatio-chromatic [2] dependence among the source pixels. this Bayesian approach, the BSS problem can be solved by . While there are three conditions that ensure separability maximizing the joint posterior density of the sources, the 1 0 of sources, namely, non-Gaussianity, non-whiteness and non- mixing matrix and the source prior model parameters [12], 1 stationarity [3], we choose to exploit spatial correlation (i.e. [13].Amethodforsolvingthejointposteriormodalestimation 1 spatial non-whiteness). The prior densities are constituted by problem is the Iterated Conditional Mode (ICM) method, v: modeling the image differentials in different directions as which maximizes the conditional densities sequentially for i Multivariate Student’s t-distributions [4]. The t-distribution each variable [11]. If the mode of the conditional density X has some convenient properties for our model: If the degree cannot be found analytically, any deterministic optimization ar offreedomparameterofthedistributiongoestoinfinity,itap- method can be used [13]. However, under any non-Gaussian proachesanormaldensity;conversely,ifthedegreeoffreedom hypothesis, ICM does not guarantee a unique global solution. parameter equals 1, the density becomes Cauchy. Therefore Another algorithm suitable for learning the Gaussian MRF the t-distribution is a flexible and tractable statistical model is the Expectation-Maximization (EM) method. Using the for data ranging from broad-tailed to normally distributed. Mean Field Approximation (MFA), the expectation step of the EM algorithm can be calculated analytically [14]. The Manuscript received September 29, 2009; revised March 15, 2010 and MFAunderGaussianmodelassumptioncausessmoothingthe accepted March 28, 2010. Koray Kayabol undertook this work with the support of the ”ICTP Programme for Training and Research in Italian edges in the image. The reason underlying these smoothing Laboratories, Trieste, Italy, through a specific operational agreement with effects is that the Gaussian approximation violates the edge CNR-ISTI, Italy. Partial support has also been given by the Italian Space preserving property of the prior density, and the effect is Agency(ASI),underprojectCOFIS(CosmologyandFundamental Physics). Theprojectispartially supported byCNR-CSICbilateral project. proportional to the amount of noise. The image model with K. Kayabol, E. E. Kuruog˘lu and E. Salerno are with the ISTI, CNR, spatially varying variance parameter in variational Bayesian via G. Moruzzi 1, 56124, Pisa, Italy, (e-mail: [email protected]; approximationcanhelp overcomethe smoothingproblem[7], [email protected]; [email protected]). J. L. Sanz and D. Herranz are with the IFCA, University of Cantabria, [8]. In [15], [14], deterministic optimization techniques have Avda.LosCastross/n39005,Santander,Spain,(e-mail:[email protected]; been used for the MRF. In [1], a Gibbs sampling stochastic [email protected]). optimizationprocedureisused.Sinceitisnotpossibletodraw B. Sankur is with the Bogazici University, Electrical & Electronics Eng. Dept.,34342,Bebek, Istanbul, Turkey,(e-mail: [email protected]). samples in a simple way due to the MRF priors in Gibbs 2 IEEETRANSACTIONONIMAGEPROCESSING,VOL.?,NO.?,???????201? distributionform,aMetropolisembeddedGibbssamplinghas First, we adoptthe t-distributionto build a priormodelof the beenadopted.In[1],itisreportedthattheMonteCarloresults source maps. More specifically, the t-distribution is used to withlessimagesmoothing,butthecostofavoidingsmoothing modelthedifferentialsofthesources,asdonebythefirst-order artifacts is a significant increase in the convergence time. homogeneousMRF models.Thisis advantageousbecausethe In this study, we propose a more efficient Monte Carlo flexibility of the t-distribution allows each source differential Markov Chain (MCMC) sampling method in lieu of random to assume a different model whether impulsive or Gaussian, walk Metropolis. To produce proposal samples in parallel, simply by setting the dof parameter. The second contribution we resort to the Langevin stochastic equation [16], [17], is the introduction of the Langevin sampling scheme in lieu [18], while the proposed samples are accepted or rejected by of the random walk Gibbs sampling. As opposed to pixel- the Metropolis scheme. In statistical physics, the Langevin by-pixel sampling, Langevin sampling generates the samples equation [16] is used to describe the Brownian motion of in parallel, thus leading to much faster convergence. Further- particles in a potential field and has been used to obtain a more, the samples are drawn in an informed way, since their smart MC algorithm in [17]. Another parallel sampling algo- generationfollowsthegradientdescentdirectiononanenergy rithm is the Hamilton Monte Carlo, which is the generalized surface. versionoftheLangevinsampler[18].Weconjecturethat,with Section II is a brief introduction to astrophysical sources. the samples produced in parallel by the Langevin equation, In Section III, the component separation problem in observa- the convergence time of the algorithm will be significantly tional astrophysicsis stated. Section IV lays out the Bayesian reduced. formulationof the problem.Section V presents the derivation Parameter estimation in Bayesian edge preserving inverse steps of the adaptive Langevin sampler algorithm along with imaging problems with Gibbs distributions is a troublesome the EM parameter estimation method. The simulation results process because of the partition function. Although there are presented in Section VI and interpreted in Section VII. are some methods [19], [20] that calculate the parameters using Monte Carlo techniques, their computational burdens II. AN INTRODUCTION TOASTROPHYSICAL SOURCES are prohibitive. One can resort to the Pseudo Likelihood (PL) approximation to make the partition function separable, Here, we only give a brief description of the astrophysical which is more convenient for parameter estimation with the radiations considered, referring the interested reader to [23] Maximum Likelihood (ML) method. In [21], two Bayesian for details. We are interested in the frequency range 30 to approaches have been used to estimate parameters, namely, 1000GHzwherethedominantdiffuseradiationsaretheCMB, the Maximum-a-Posteriori (MAP) and evidence approaches. the galactic synchrotron radiation and the thermal emission An approach to estimate the regularization parameter from from galactic dust. Studying this radiation would help us to the PL approximation has been recently proposed [22]. The understandthedistributionandthefeaturesofinterstellardust multivariatet-distributionis also a PL approximationto MRF in our galaxy. and has advantages over MRF in parameter calculations. The most interesting astrophysicalsource in the microwave There are two types of parameters in edge preserving region of the electromagnetic spectrum is the CMB, a relic inverse imaging. The first one is the adaptive edge preserving radiation originating from the time when the Universe was parameter, which is also known as threshold parameter. We 300.000 years old. The discovery of the CMB is one of the interpret the threshold parameter as the scale parameter of fundamental milestones of modern cosmology and its study the t-distribution. The other parameter is the regularization allows us to determine fundamental parameters such as the parameter, which adjusts the balance between the likelihood age of the universe, its matter and energy composition, its and the prior.The regularizationparametercorrespondsto the geometry and many other relevant cosmological parameters. degree of freedom (dof) parameter of the t-distribution. To CMB should be a blackbody radiation at a temperature of estimatethescaleanddofparametersofourt-distribution,we 2.726 K, thus its emission spectrum should be perfectly use ML estimation via EM algorithm as in [30]. A similar known. The CMB emission dominates over the other sources approach has been used in [7] and [8]. at frequenciesaround 100 GHz. The CMB temperature is not Inacomparativestudyamongimagesourceseparationalgo- perfectly anisotropic. Standard cosmological models predict rithms [1], we have found that the Bayesian formulation with that the CMB anisotropy is Gaussian distributed, although MRFpriorandGibbssamplingoutperformedtheheuristicand some alternative models permit a certain degree of non- theotherBayesianapproaches.Thisworkisbasedon[1],and Gaussianity.Currentobservationsare compatiblewith the hy- aimstoachieveamuchfasterMCMCimplementationwithout pothesis of Gaussianity. Two all-sky surveys have been made compromising its good performance. With this goal in mind, on CMB so far,by NASA’s satellites COBE [32] and WMAP we have been testing our algorithms on a current problem of [33]. A European mission whose data will be highly accurate modern astrophysics: the separation of radiation source maps andspatiallyresolved,Planck[23],isabouttoprovideitsfirst from multichannel images of the sky at microwave frequen- full-sky coverage maps. cies. In particular, we have been applying our algorithm to The CMB signal is mixed with other astrophysical sources the separation of the Cosmic Microwave Background (CMB) ofelectromagneticradiation.Relativisticelectronsbeingaccel- radiationfrom the galactic emission (synchrotronand thermal eratedbymagneticfieldsintheGalaxygiverisetosynchrotron dust emissions) using realistically simulated sky maps. emission, which dominates over the CMB in regions close Theoriginalcontributionsofthepaperhingeontwoaspects. to the Galactic plane especially at low frequencies (< 200 KAYABOLetal.:ADAPTIVELANGEVINSAMPLERFORSEPARATIONOFT-DISTRIBUTIONMODELLEDASTROPHYSICALMAPS 3 TABLEI ROOTMEANSQUAREERROR(RMSE)OFFITTINGOFTHEIMAGE DIFFERENTIALHISTOGRAMSOFCMB,SYNCHROTRONANDDUST 500 HCiasutocghryan 500 HCiasutocghryam COMPONENTS.THELASTCOLUMNSHOWSTHEESTIMATEDDOF 440500 tG−aduisstsriibauntion 440500 tG−aduisstsriibauntion PARAMETERSOFTHEt-DISTRIBUTION. 350 350 300 300 Horizontal direction 250 250 200 200 Gaussian Cauchy t-distribution dof 150 150 CMB 15.11 30.84 15.47 25.71 100 100 50 50 Synchrotron 23.70 30.06 15.70 3.59 0 −0.1 −0.05 0 0.05 0.1 0 −0.1 −0.05 0 0.05 0.1 Dust 67.99 33.21 13.84 1.81 Vertical direction (a) CMBhorizontal (b) CMBvertical Gaussian Cauchy t-distribution dof CMB 15.75 29.70 15.83 15.75 Synchrotron 19.11 31.00 18.39 19.11 Dust 63.73 68.19 54.42 2.18 890000 HCtG−iaadsuuitsocstghsriryibaaunmtion 455000 HCtG−iaadsuuitsocstghsriryibaaunmtion 700 400 600 350 500 300 sky patch, located at 0◦ galactic longitude and 40◦ galactic 400 220500 latitude, discretized into a 512 512-pixel map. Within this 300 150 × 200 100 patch, we have introduced simulated CMB, synchrotron, and 100 50 dustradiationmaps(asin[26]).Wehavecomputedthesource 0 0 −8 −6 −4 −2 0 2 4 x6 10−3 −6 −4 −2 0 2 4 6x 10−4 image differentials for horizontal and vertical directions, and (c) Synchrotron horizontal (d) Synchrotronvertical estimated their empirical distributions. We have fitted three different functions to the empirical distributions of the image differentials of the astrophysical sources with nonlinear least Histogram 4500 Histogram square method using the Curve Fitting Toolbox of MATLAB. 4000 Cauchy Cauchy t−distribution 4000 t−distribution 3500 Gaussian 3500 Gaussian These functions are proportional to Gaussian, Cauchy and t- 3000 3000 distribution. Fig. 1 shows the fitting results for CMB, syn- 2500 2500 2000 2000 chrotronanddustimages.TableIliststheresidualRootMean 1500 1500 SquareErrors(RMSE)ofthefits. TheGaussiangivesthebest 1000 1000 500 500 fit for CMB because the CMB is theoretically distributed as 0 0 −0. 01−0.008−0.006−0.004−0.002 0 0.0020.0040.0060.0080.01 −15 −10 −5 0 5 a Gaussian [24]. Overall, the t-distribution appears to be a x 10−3 (e) Dusthorizontal (f) Dustvertical good choice for modeling the image differential statistics in horizontal and vertical directions of all the components. The Fig. 1. Fitting plots of the image differential histograms (dots) of CMB, estimated dof parameters of t-distributions show that indeed synchrotron and dust components in horizontal and vertical directions. The fitted functions proportional to Gaussian (dot line), Cauchy (dash-dot line) the proposed model assumes from impulsive to Gaussian andt-distribution (dashline). characteristic underlying each component. If the component is Gaussian as CMB, the dof parameter becomes bigger and if it is impulsive, the dof parameter becomes very small. GHz). According to observations in other frequency bands, the synchrotron emission spectrum follows a power law with a negative exponent, whose value is presently known with III. COMPONENT SEPARATION PROBLEM IN high uncertainty.Synchrotronis the dominantradiationin the OBSERVATIONALASTROPHYSICS low-frequencybandsofourrangeofinterest.Inter-stellardust Virtually any application in observational astrophysics has grainsareheatedbynearbystarsandre-emitthermalradiation to do with problems of component separation. Indeed, all in the far infrared region of the electromagnetic spectrum. the astrophysical observations result from the superposition Dust radiation is dominant in the high end of our range. of the radiation sources placed along the line of sight. While In particular, it is almost the only significant contribution to very distant sources can be distinguished by the redshift the total diffuse radiation between 800 GHz and 1000 GHz. analysis, for nearby sources this is not possible. In fact, Its emission spectrum should follow a greybody law, with physically distinct sources can sometimes be found within a unknown spectral index and an additional degree of freedom close range of each other. Furthermore, high sensitivity and given by the thermodynamicaltemperatureof the dust grains. high resolution measurements can give rise to source mixing For a short review on CMB astronomy, see [24]. problems even in the cases where the radiation under study There are other astrophysicalsources present at microwave is dominant over interfering radiations. Apart from redshift frequencies, such as free-free emission due to free electrons, analysis,usefulmethodsto distinguishbetweensuperimposed anomalous dust emission and radiation coming from extra- physically different radiations include spectral analysis and galactic sources, but their relevance is smaller. In this work morphologicalanalysis.Inthispaper,weonlytreattheformer we will focus on the main three astrophysicalsources present approach, exploiting the differences in the emission spectra in CMB experiments: CMB, synchrotron and dust. shown by physically distinct radiation sources. This implies In order to justify our adoption of the Student’s t- thattheseparationmustbedoneonthebasisofmeasurements distribution in a relevant case, we have selected a 15◦ 15◦ made at different frequency bands. × 4 IEEETRANSACTIONONIMAGEPROCESSING,VOL.?,NO.?,???????201? Weassumethattheobservedimages,y ,k 1,2,...,K , wherethe asterisk meansconvolution,and h is the telescope k k ∈{ } are linear combinations of L source images. Let the kth radiation pattern in the k’th observationchannel. Note that, if observed image be denoted by y , where i 1,2,...,N h is the same for all the channels, (4) can be written as k,i k ∈ { } represents the lexicographically ordered pixel index. The im- L L age separation problem consists in finding L independent y˜ = a h s +n = a ˜s +n (5) k k,l l k k,l l k sources from K different observations. If s and y denote ∗ l k l=1 l=1 N 1vectorrepresentationsofsourceandobservationimages, X X × and the problem is again instantaneous for the modified respectively, then the observation model can be written as sources ˜s , which are the physical sources smoothed by the l L common radiation pattern h. Unfortunately, especially in the yk = ak,lsl+nk, k =1,...,K (1) radio- to millimeter-wave ranges, the telescope aperture de- l=1 pendsstronglyonfrequency,andmodel(5)cannotbeadopted X where n is an iid zero-mean noise vector with Σ = σ2I directly,unlesstheobservedsignalsarepreprocessedtoreduce k k N covariance matrix and I is an identity matrix. Although the their angular resolution to the worst available (see [26]). N noise is not necessarily homogeneous in the astrophysical Since no imagingsystem can achievean infinite resolution, maps, in this study we assume that the noise variance is we shoulduse (4) as ourgenerativemodel.However,to avoid homogeneouswithin each sky patch and is also known. problemswith the convolutivemixtures, we assume to have a Since the observation noise is assumed to be independent telescope with the same radiation pattern in all the channels, and identically distributed zero-mean Gaussian at each pixel, soastobeabletousemodel(5).Notethatthiscanalwaysbe the likelihood is expressed as obtainedbypreprocessing,providedthatallthebeampatterns are known. Hereafter, as it will not cause any ambiguity, we K dropthetildeaccentfromthesymbolsusedtodenotethedata p(y s ,A) exp W(s y ,A,σ2) (2) 1:K| 1:L ∝ − 1:L| k k and the source vectors. k=1 Y (cid:8) (cid:9) (y L a s ) 2 W(s y ,A,σ2) = || k− l=1 k,l l || (3) IV. SOURCE SEPARATION DEFINEDIN THE BAYESIAN 1:L| k k 2σ2 P k FRAMEWORK wherethemixingmatrixAcontainsallthemixingcoefficients A. Source Model a introduced in (1). k,l Neighborpixelsinourimageshavestrongdependency.This For many purposes, a mixing model of the type (1) is is demonstrated, for example, in Fig. 2(d) where the scatter- considered to fit reasonably well to an astrophysical obser- plotshowsthefirstorderrightneighborpixelsofthedustmap vation. The details on how to get an equation similar to (1) shown in Fig. 2(a). The dependency decreases in the high from the physics of the problem can be found in [25]. Here, intensity region. This region in the scatter-plot corresponds we only summarize the main assumptions made with this to spatially localized structures with high image intensity of purpose.First,we assumethatthesuperpositionofthesignals the map in Fig. 2(a). The existence of a small number of originating from different sources is linear and instantaneous. point-likestructuresinthemapsindicatesthatthedependency In the astrophysical case, this assumption is clear, since the assumption is valid. In view of this, we can write an auto- physical quantities to be measured are superpositionsof elec- regressive source model using the first order neighborsof the tromagnetic waves coming, for any bearing, exactly from the pixel: same line of sight without any scattering or diffraction effect. s =α G s +t (6) Thesecondassumptioninmodelingastrophysicalobservations l l,d d l l,d is that each source has an emission spectrum that does not where d 1,...,D denotes one of the main directions ∈ { } varywith thebearing.Thisassumptionimpliesthatindividual (left, right, up and down) and D = 4 is the cardinality of radiations result from the product of a fixed spatial template the set of image differential directions. Matrix G is a linear d and an isotropic emission spectrum. Both assumptions need one-pixel shift operator in direction d, α is the regression l,d closer attention. The precise emission spectrum generated by coefficient and the regression error t is an iid t-distributed l,d any physical phenomenon depends on many quantities that zero-meanvectorwithdofparameterβ andscaleparameters l,d may not all be distributed uniformly in the sky. Although in δ , (t 0,δ I ,β ). We can justify the iid assumption l,d l,d l,d N l,d T | many applications the isotropy assumption has been adopted of t by plotting (see Fig. 2(e)) the values in t versus l,d l,d successfully, in many other cases the space-variability of the its first order neighbors, G t . We can interpret t as a d l,d l,d radiation sources must be taken into account to allow a good decorrelated version of s . Fig. 2(b) shows t for d=1. By l l,d separation to be performed. Furthermore, if the effect of the comparing Fig. 2(d) and (e), we can say that t is spatially l,d telescope is taken into account, then the instantaneous model more independent than s . l isnomorevalidsince,forthefiniteaperture,thelightcaptured If the image s were Gaussian distributed, then the regres- l in a fixed direction in the telescope does not come from that sion error would also be Gaussian. However in real images, direction alone. In formulas, model (1) becomes the regression error is better modelled by some heavy-tailed distribution. The t-distribution can conveniently model the L y˜ =h y =h a s +n (4) statistics of data whose distribution ranges from Cauchy to k k k k k,l l k ∗ ∗ Gaussian, and therefore it is a convenient model for the l=1 X KAYABOLetal.:ADAPTIVELANGEVINSAMPLERFORSEPARATIONOFT-DISTRIBUTIONMODELLEDASTROPHYSICALMAPS 5 a weak dependence between right and left difference images, (a) (b) (c) we maintain the independency assumption to constitute the productsoftheregressionerrorprobabilitiesmodel.A similar approach to constitute a single prior by multiplying different individualpriorscanbefoundin[27].Inthisstudy,weusethe left/right and up/down differences jointly in the prior model to balance their contributions. Otherwise, we could obtain (d) (e) (f) directionally biased results. 0.03 0.01 0.01 The t-distribution can be written in implicit form by using 0.02 Gs10.01 Gt11 0 t3 0 a Gaussiananda Gammadensity.Thet-distributionhasa dof parameter β, which is itself governed by Gamma distribution 0 −0.01 −0.01 0 0.010.02 −0.01 0 0.01 −0.01 0 0.01 withparameterβ/2.Thet-distributionhasthefollowingform s t t 1 1 [30]: (g) 0.01 p(t α ,β ,δ )= p(t ν ,δ )p(ν β )dν l,d l,d l,d l,d l,d l,d l,d l,d l,d l,d | | | t4 0 δZ I β β = t 0, l,d N ν l,d, l,d dν . (8) l,d l,d l,d −0.−001.01 0 0.01 Z N (cid:18) | νl,d (cid:19)G(cid:18) | 2 2 (cid:19) t 1 The interpretation of this equation is that if t δ ,ν is l,d l,d l,d | distributed with a normal density (t 0,δ I /ν ) and l,d l,d N l,d N | theparameterν hasaGammaprioras (ν β /2,β /2), l,d l,d l,d l,d Fig.2. (a):Thesynchrotron maps,(b):Right t1=s−G1sand(c):Up the distribution of t β ,δ becomesGa t-d|istribution such t3=s−G3sdifferencemapsof(a).(d):Scatter-plotofthefirstorderright l,d| l,d l,d neighbor pixels of (a), s and G1s. (e): Scatter-plot of the first order right that T(tl,d|0,δl,dIN,βl,d). The representation in (8) is a neighbor pixels of(b),t1 andG1t1.(f):Scatter-plot oftherightdifference particularcaseoftheGaussianScaleMixture(GSM)densities. (b)versusupdifference(c),t1andt3.(g):Scatter-plotoftherightdifference The ML estimation of the parametersα , β and δ using (b)versusleftdifference (c),t1 andt4. l,d l,d l,d the EM method [30] is given in Section V-B. statistics of the high spatial frequency contents of images, B. Posteriors such as regression errors [6]. The scale parameter δ can l,d The joint posterior density of all the unknowns in the BSS be assumed as a space-varyingparameterto model the highly problem can be written as: non-stationary sources, i.e. sparse sources, but homogenous variance assumption has been observed to be adequate for p(s ,A,Θy ) p(y s ,A)p(s ,A,Θ) (9) 1:L 1:K 1:K 1:L 1:L | ∝ | diffuse astrophysical source images for the image patch sizes where p(y s ,A) is the likelihood and p(s ,A,Θ) thatweuseinthesimulations.Weuseahomogeneousvariance 1:K| 1:L 1:L is the joint prior density of unknowns. The joint prior since, in our case, the increased complexity derived from can be factorized as p(s α ,β ,δ ) p(A) inhomogeneity is not justified by a significant improvement 1:L| 1:L,1:D 1:L,1:D 1:L,1:D p(β ) p(δ ) p(α ). Furthermore, since the in performance. 1:L,1:D 1:L,1:D 1:L,1:D sources are assumed to be independent, the joint probability The regression error t represents the directional image l,d density of the sources is also factorized as p(s Θ) = differential in the direction d. The multivariate probability L p(s Θ). 1:L| density functionof an image modelledby a t-distribution can l=1 l| Mathematically, we can assume uniform priors for α be defined as Q l,d ∈ ( 1,1), δ (0, ) and a (0, ), because a ’s l,d k,l k,l − ∈ ∞ ∈ ∞ p(t α ,β ,δ ) = Γ((N +βl,d)/2) are always positive. The practical usage of these priors is l,d| l,d l,d l,d Γ(β /2)(πβ δ )N/2 explainedinSectionV-D.WeuseaconjugateGammapriorfor l,d l,d l,d 1+ φd(sl,αl,d) −(N+βl,d)/(72) pβal,rdam∼etGer(s1/e2x,p2er×im1e0n−ta3l)l.y.WTehheavceonddeitteiormnailnepdosthteeriroersspeocftiavlel × β δ (cid:20) l,d l,d (cid:21) model parameters are written as where φ (s ,α )= t 2 = s α G s 2 and Γ(.) is the Gamdmalfunl,cdtion.|W| le,dc||an wr||itel−the dl,ednsidtyl|o|f s by using p(ak,l|y1:K,s1:L,A−ak,l,Θ) ∝ p(y1:K|s1:L,A) l p(α y ,s ,A,Θ ) p(t Θ) the image differentialsin differentdirections,assuming direc- l,d| 1:K 1:L −αl,d ∝ l,d| tionalindependence,as p(sl|Θ)= Dd=1p(tl,d|αl,d,βl,d,δl,d) p(βl,d|y1:K,s1:L,A,Θ−βl,d) ∝ p(tl,d|Θ)p(βl,d) (10) jwuhsteirfey tΘhis=ass{uαm1p:Lti,o1:nD,bβy1:pLl,o1t:tDin,gδQ1t:Lhe,1:Dho}r.izWonetalcarnegrseimsspiolyn p(δl,d|y1:K,s1:L,A,Θ−δl,d) ∝ p(tl,d|Θ)p(δl,d) p(s y ,s ,A,Θ) p(y s ,A)p(s Θ) error versus the vertical one as shown in Fig. 2(f). We can l| 1:K (1:L)−l ∝ 1:K| 1:L l| observe in this figure that the scatter plot of the regression where variable expressions in the subscripts denote the − errors in left and up directions is almost circular and this removalof thatvariablefromthe variableset. The parameters justifies our assumption of independence.Fig. 2(g) shows the α,β andδ havesizeL D,AhassizeK Landthesources × × scatter-plotofrightdifferenceversusleftdifference.Inspiteof havesizeL N.Overallthereare(3D+K+N)Lunknowns. × 6 IEEETRANSACTIONONIMAGEPROCESSING,VOL.?,NO.?,???????201? For parameters α , δ and β , we exploit the EM The energy function W(s A) was defined in (3). The l,d l,d l,d 1:L | method. To estimate the source images, we use a version of total energyfunctionis proportionalto the negativelogarithm theposteriorp(s .)augmentedbyauxiliaryvariablesandfind of the posterior. In summary, the three terms correspond, l | the estimation with a Langevinsampler. The details are given respectively, to the fit to data and to the inertial and the in Section V. kinetic energy terms. We can define the Lagrangian function: L(s (t),v (t)) = K(v ) W(s ) U(s ) and write the l l l 1:L l − − V. ESTIMATION OFSOURCES AND PARAMETERS Lagrange-Euler equation for the Lagrangian as follows In this section, we give the details of the estimation of the d ∂L(s(t),v (t)) ∂L(s (t),v (t)) l l = l l , (15) sources and the parameters. dt ∂v ∂s (cid:18) l (cid:19) l dv ∂ M l = E(s ). (16) A. Sources l dt −∂s l l We modify the posterior densities of the source images where E(s ) = W(s ) + U(s ). If we discretize the 1:L 1:L l p(s Θ) to obtain a more efficient MCMC sampler. In the l| dynamicsin(16)andvelocityvl(t)usingtheLeapfrogmethod classical MCMC schemes, a random walk process is used [18], we obtain the following three-step iteration to produce the proposal samples. Although random walk is simple,it affectsadverselythe convergencetime.The random vk+12 = vk 1τ M−12g(sk ) (17) walk process only uses the previous sample for producing a l l − 2 l l 1:L newproposal.Insteadofa randomwalk,we usethe Langevin sk+1 = sk+τ vk+12 (18) l l l l stochastic equation, which exploits the gradient information vk+1 = vk+12 1τM−12g(sk ) (19) of the energy function to produce a new proposal. Since the l l − 2 l l 1:L gradient directs the proposed samples towards the mode, the final sample set comes mostly from around the mode of the where g(sk1:L) = [∇slE(s1:L)]s1:L=sk1:L, ∇sl is the gradient with respectto s andτ is the discretetime step.If we define posterior [28], [29]. l l 1 −1 The Langevin equation can be obtained from the total a diagonal matrix Dl2 = τlMl 2, so that, for the nth pixel, energyfunction.We firstdefineeverythingin continuoustime the diffusion coefficient is Dl(n,n) = τl2/ml,n. Matrix D is referred to here as the diffusion matrix, and is derived in to givethe derivationsteps of the Langevinequation,then we Section V-A1. Instead of this step scheme, we use the one- transfer them into discrete time. To obtain the total energy step Langevin difference equation. To obtain the single step function,we introducea velocityparameterv (t)=ds (t)/dt l l Langevin update equation for s , we substitute (17) into (18). to define the kinetic energy such that l K(vl(t)|Ml)= 12vlT(t)Mlvl(t) (11) skl+1 =skl − 21Dlg(sk1:L)+Dl12Ml21vlk (20) where M is a diagonal matrix whose diagonal elements This form is also used in [28], [29]. The samples are correspond to mass parameters m for pixel index n = produced by using this first order equation, and then they are l,n 1,...,N. Using the velocity parameter v , the modi- tested in the Metropolis-Hastings scheme. l fied version of the posterior density in (10) is writ- If we assume the transitions in (20) as a Wiener process ten as p(sl,vl y1:K,s(1:L)−l,A,Θ,Ml) p(y1:K s1:L,A) and take into account the fact that the velocity vector vl is p(sl Θ)p(vl M|l). More explicitly, it can b∝e written a|s independent of the source vector sl, [18], then its probabil- | | ity density function can be set as a multivariate Gaussian p(sl,vl|y1:K,s(1:L)−l,A,Θ,M)∝ as pvl(vl) = (|Ml|/2π)12 exp −12vlT(t)Mlvl(t) . We can exp (W(s A)+U(s Θ)+K(v M)) (12) produce a random sample from this probability such that 1:L l l {− | | | } −1 (cid:8) (cid:9) v =M 2w where w is a zero-meanGaussian vector with where the energy function U(s Θ) of a source image can be l l l l l| identity covariance matrix (w 0,I). If we substitute this written in terms of image differentials tl,d as random sample into (20), wNe obtla|in the associated Langevin D equation U(sl|Θ)= ρ(tl,d|Θ). (13) sk+1 =sk 1Dg(sk )+D21w (21) d=1 l l − 2 l 1:L l l X where the function ρ(t Θ) is proportional to the negative Since the random variables for the image pixel intensities l,d | logarithm of the t-distribution in (7), that is, are producedin parallel by using (21), the procedure is faster than the random walk process adopted in [1]. The random N +β φ (s ,α ) ρ(t Θ)= l,d log 1+ d l l,d (14) walkprocessproduceslocalrandomincrementsindependently l,d | 2 β δ (cid:20) l,d l,d (cid:21) fromtheneighborpixelsandtheobservations.IntheLangevin and the function log[1+φ (s ,α )/β δ ] is the regular- sampler, the samples are generated in an interrelated manner d l l,d l,d l,d ization function proposed in [10]. The terms (N + β )/2 and in terms of the descent of an energyfunctionthat reflects l,d and β δ correspond to the regularization and the thresh- the goodness of the model fit. Once the candidate sample l,d l,d old parameters, respectively, used in edge preserving image image is produced by (21), the accept-reject rule is applied reconstruction. independently to each pixel. In the case of random walk, KAYABOLetal.:ADAPTIVELANGEVINSAMPLERFORSEPARATIONOFT-DISTRIBUTIONMODELLEDASTROPHYSICALMAPS 7 TABLEII p(s ,v y ,s ,A,Θ,M), we obtain the following METROPOLIS-HASTINGSALGORITHMFORASOURCEIMAGE.u:UNIFORM l l| 1:K (1:L)−l POSITIVERANDOMNUMBERINTHEUNITINTERVAL;z:GENERATED equation SAMPLEVECTORTOBETRGIEENDE;RϕA(TzEnD,sSklA,nM)P:LAEC.CEPTANCERATIOOFTHE E(s +∆s ) = E(s )+ E(s )T∆s + 1∆sTH(s )∆s h l l i h l ∇ l l 2 l l li 1) wl∼N(wl|0,I) (25) 2) H(skl)←−[diag{H(sl)}sl←−skl]−1 whereH(sl)istheHessianmatrixofE(sl)withrespecttosl. 3) Dl←−2[H(skl)]−1 Fromthisequation,theoptimuminfinitesimal∆sl isfoundas 45)) gpr(osdk1u:Lce)z←←−−[∇sskll−E(12s1D:Llg)](ss1k1:L:L=)s+k1:LDl21wl from(21). ∆sIlf=we−a[hlHso(stal)kie]−t1hhe∇eExp(escl)tai.tion of both sides of (20), we 6) forallpixeln=1,...,N obtain a) calculate ϕ(zn,skl,n) sk+1 = sk 1D g(sk ) (26) b) ifϕ(zn,skl,n)≥1thenskl,+n1=zn h l i h li− 2 l 1:L elseproduceu∼U(0,1). eiflsue<skl,+nϕ1(z=n,ssklkl,,nn)thenskl,+n1=zn, paDnelcdgta(cstiok1om:Lnp)oafr=itnhge−ih2ns[vhkleH+rs1(eisol)=fi]H−he1sshkls∇iiaEn+(ms∆la)stirl.ixwR,wiattheheu(r2se6th)i,atsnwdetihageworenixtae-l c) n+1←−nextpixel. calculated by the value of s at the discrete time k as l D g(sk )= 2[H(sk)]−1g(sk ) (27) we would produce the candidate sample pixel and apply the l 1:L − l 1:L accept-reject rule. The sampling of the whole image would where H(sk) = [diag H(s ) ]−1 and diag . operator be completed by scanning all the pixels in a sequential order extract thelmain diago{nal olf }thsel=Hsklessian matrix.{F}rom (27), as in Gibbs sampling. Since each pixel has to wait the we can find the diffusion parameter as [35]: update of the previous pixel, this procedure is very slow. In random walk, candidate pixels can be produced in parallel D =2[H(sk)]−1. (28) l l but, producing a candidate sample for the whole image using This approximation is justified if H(s ) is strongly diago- random walk is not a reasonable method because hitting the l rightcombinationforsuch a hugeamountof data(i.e. 105) nally dominant. ≈ is almost impossible. By Langevin sampler, the likelihood of approximatelyhitting the rightcombinationat any one step is B. Parameters of t-distribution much higher. We can writethe jointposteriorof theparametersα , β After their production, the samples are tested via l,d l,d and δ such that p(α ,β ,δ t ,Θ ) = Metropolis-Hastings [34] scheme pixel-by-pixel. The accep- l,d l,d l,d l,d| l,d −{αl,d,βl,d,δl,d} p(t Θ)p(β )p(δ ). Using the likelihood p(t Θ) in (8) tance probability of any proposed sample is defined as l,d| l,d l,d l,d| min ϕ(sk+1,sk ),1 , where and the priors of the parameters, we can find the MAP esti- { l,n l,n } mates of the parameters of the t-distribution by EM method. q(sk sk+1) Instead of maximizing the log p(t Θ)p(β )p(δ ) , we ϕ(skl,+n1,skl,n)∝e−∆E(skl,+n1)q(skl,+n1| sl,kn ) (22) maximize the following function{ l,d| l,d l,d } l,n | l,n Ewh(sek1re:L,∆n)E(=skl,+nW1)(s=k1:L,nE)(s+kl,+nU1,(ssklk(,1n:L).)−Fl,onr) a−nyE(ssink1g:Lle,n)piaxneld, Z log(p(tlp,d(|νΘl,)dp|t(klβ,dl,,dΘ)pk()δl,d))p(νl,d|tkl,d,Θk)dνl,d (29) U(sl,n) can be derived from (13) and (14) as =hlog{p(tl,d|Θ)p(βl,d)p(δl,d)}iνl,d|tkl,d,Θk D 1+β φ (s ,α ) logp(ν tk ,Θk) U(sl,n)= l,d log 1+ d l,n l,d (23) − l,d| l,d νl,d|tkl,d,Θk 2 β δ d=1 (cid:20) l,d l,d (cid:21) (cid:10) (cid:11) X Theproposaldensityq(sk+1 sk )isobtained,from(21),as where p(ν tk ,Θk) is the posterior density of the hid- l,n | l,n l,d| l,d den variable ν conditioned on parameters estimated in the τ2 τ2 l,d N skl,+n1|skl,n+ 2ml g(sk1:L,n),ml (24) previous step k and h.iνl,d|tkl,d,Θk represents the expectation (cid:18) l,n l,n(cid:19) with respect to ν tk ,Θk. For simplicity, hereafter we use One cycle of the Metropolis-Hastings algorithm embedded only . to represle,dn|t tlh,dis expectation. The parameter ν is a l,d inthemainalgorithm,foreachsourceimage,isgiveninTable hiddehni(or latent) variablethat changesthe scale of the Gaus- II. sian density (t 0,δk I /ν ) and has a Gamma prior 1) Diffusion Matrix: In this section, we give a method (ν βk /2,Nβk l/,d2|).Byl,dexNploilt,idngν , we can definethe t- to find an optimum diffusion matrix D. The method must G l,d| l,d l,d l,d distributionasascalemixtureofGaussiansasin(8).Thesec- jeonisnutrceontdhiattionthaeldipsrtoridbuuctieodnpsa(sm,pvle yskl+1,scomes,Afro,Θm,Mthe) ondtermontherighthandsideof(29),−hlogp(νl,d|tkl,d,Θk)i, l l 1:K (1:L)−l correspondstotheentropyoftheposteriordensityofν ,and | l,d introduced in (12). If we write the Taylor expansion of is independent of the unknowns, and the function E(sk) with the infinitesimal ∆s and take the expec- l l tation of both sides with respect to the joint density Q(Θ;Θk)= log p(t Θ)p(β )p(δ ) . (30) l,d l,d l,d h { | }i 8 IEEETRANSACTIONONIMAGEPROCESSING,VOL.?,NO.?,???????201? TABLEIII The aim is to find the maximum of Q(Θ;Θk) with respect ONECYCLEOFADAPTIVELANGEVINSAMPLERFORSOURCE to Θ; SEPARATION.THESYMBOL←−DENOTESANALYTICALUPDATE,THE Θk+1 =argmaxQ(Θ;Θk) (31) SYMBOL←−0DENOTESUPDATEBYFINDINGTHEZEROROOTANDTHE Θ SYMBOL∼DENOTESUPDATEBYRANDOMSAMPLING. In the E (expectation) step of the EM algorithm, we must Findtheinitial mixingmatrix(i.e.FDCCA[31]). cfianldcutlhaeteptohsetereixopredcetantsiiotny oh.fiννll,,dd|tkl,d,Θk. For this purpose, we FIniintdialtihzeeitnhietiaplarsaomurecteerismαag0l,eds,uβsl0i,ndgatnhdeδLl0,Sdsolution. forallsourceimages,l=1:L p(νl,d|tkl,d,Θk)=p(tkl,d|Θk,νl,d)p(νl,d) foralldirections, d=1:D =G =νl,Nd|N(t/kl,2d,|0φ,dδ(lks,dklI,Nα/kl,νd)l,/dδ)lGk,d(νlG,d|βνlk,ld,d/|2β,lk,βdlk/,d2/,2β)lk,d/2 hνl,di←− NsTβ+Glkβ,dTlk,sd (cid:18)1+ φdβ(lks,kld,δαlk,kld,d)(cid:19)−1 The (cid:16)ex=peGct(cid:18)atνiol,nd|oNf+ν2βl,lkd,di,sβl2k,d (cid:18)1(cid:17)+ φ(cid:16)dβ(lks,kld,δαlk,kld,d)(cid:19)(cid:19). ((cid:17)32) +βαδlll,,,ddβdl←←,←d−−−−10hsν−[Tll−,lGd0ψi.Td01φ0dG(d2β(dlls2=s,lNdl,α)0l]+,d)logβl,d+hlogνl,di−hνl,di −1 βl,d N +βk φ (sk,αk ) forallpixels, n=1:N In the hMνl,d(im=aximβizlka,dtilo,dn) 1ste+p, d(β30lk,l)dδilk,sdl,dma!ximized w(3i3th) Uskls,+nin1g∼Metrpo(psoll,ins-|Hy1a:sKtin,gΘst−msel,tnho)dinTableII forallelements ofnthemixingmatrix,(k,lo)=(1,1):(K,L) respect to Θ. To maximize this function, we alternate among the variables αl,d, βl,d and δl,d. After taking the logarithms ak,l←− sTl1slsTl (yk− Li=1,i6=lak,isi)u(ak,l) and expectations in (30), the cost functions for αl,d, βl,d and P δ are written as follows l,d D. Adaptive Langevin Sampler Algorithm φ (s ,α ) Q(α ;Θk)= ν d l l,d (34) l,d −h l,di 2δ The proposed Adaptive Langevin Sampler algorithm is l,d givenin Table III. The symbol denotesanalyticalupdate, Q(δ ;Θk)= N logδ ν φd(sl,αl,d) (35) the symbol denotes updat←e−by finding the zero root and l,d l,d l,d 0 − 2 − h i 2δ ←− (cid:18) l,d (cid:19) the symbol denotes the update by random sampling. The ∼ Q(β ;Θk) = logΓ(βl,d)+ N+βl,d 1 logν sampling of the sources is done by the Metropolis-Hastings l,d − 2 2 − h l,di scheme given in Table II. To deal practically with uniformly +βl,d−1log(cid:16)β N+βl,(cid:17)d log2 2 l,d− 2 distributed positive variables, we assume that they lie in the hνl,diβlk,d 1+ φd(sl,αl,d) 0.002β range [0.0001,1000]. − 2 (cid:18) βlk,dδlk,d (cid:19)− l,d 1) Initialization: We start the algorithm with the mixing (36) matrix obtained by the FDCCA (Fourier Domain Correlated The solutions to (34) and (35) can be easily found as Component Analysis) [31] method. The initial values of as- sTGTs α = l d l (37) trophysical maps are obtained by Least Square (LS) solution l,d sTl GTdGdsl with the initial mixing matrix. The initial values of αl,d can φ (s ,α ) be calculated directly from image differentials. We initialized δl,d = νl,d d l l,d (38) theβ0 =0andfoundtheinitialvalueofδ byequalingthe h i N l,d l,d The maximization of (36) does not have a simple solution. expectation(33)toaconstant.Inthisstudy,wetaketheinitial It can be solved by setting its first derivative to zero: value of this posterior expectation 1.5. So the initial value of δ0 =1.5φ (s0,α0 )/N ψ (βl,d)+logβ + logν ν l,d d l l,d − 1 2 l,d h l,di−h l,di (39) 2) Stopping Criterion: We observe the normalized abso- +βl,d−1 0.002=0 lute difference between sequential values of s to decide βl,d − l the convergence of the Markov Chain to an equilibrium. where ψ (.) is the first derivative of logΓ(.) and it is called 1 If sk sk−1 /sk−1 10−2, we assume the chain has digamma function. | l − l | | l | ≤ converged to the equilibrium for s and denote this point l C. Parameters of the Mixing Matrix Tl = k. Since we have L parallel chains for L sources, the ending point of the burn-in period of the whole Monte Carlo We assume that the prior of A is uniform between 0 chain is T =max T . We ignore the samples before T . We and . The conditional density of a is expressed as s l l s k,l p(a ∞y ,Θt ) p(y Θt). From (2), it can be seen keeptheiterationgoinguntilTe thatistheendingpointofthe k,l| 1:K −ak,l ∝ 1:K| post burn-in period simulation. In the experiments, we have that the conditional density of a becomes Gaussian. The k,l used 100 iterations after burn-in period, so T =T +100. parameter a is estimated in each iteration as e s k,l L 1 a = sT(y a s )u(a ) (40) VI. SIMULATION RESULTS k,l sTs l k− k,i i k,l l l i=X1,i6=l Totestourprocedure,weassumenineobservationchannels where u(a ) is the unit step function. with center frequencies in the range 30–857 GHz, where the k,l KAYABOLetal.:ADAPTIVELANGEVINSAMPLERFORSEPARATIONOFT-DISTRIBUTIONMODELLEDASTROPHYSICALMAPS 9 Original LS (Initial) ALS−t GS−MRF 40 0.2 B CM 0 35 −0.2 30 Synchrotron 0000....000011112468 psir [dB]25 Ts 0.01 20 CMB Synchrotron 0.03 Dust Dust 00..0012 150 50 100Iteration numbe1rs50 200 250 Fig.4. ThePSIRsofthesourcesinthepixeldomainasafunctionofiteration number.Thevertical linesignifiesthestartingpointTs. Fig.3. Theseparatedastrophysicalimagesfromnoisyobservationswiththe LS,ALS-tandGS-MRFmethods.Thelocation ofthepatchis0◦ longitude TABLEIV and40◦ latitude, outoffgalactic plane,andhasasize64×64pixels. THEPSIR(DB)VALUESOFTHESEPARATEDCOMPONENTSANDTHE PROCESSTIMEOFTHEALGORITHMSINMINUTES. CMB Synchrotron Dust time dominant diffuse radiations are the CMB, the galactic syn- LS 30.69 15.03 37.37 1.32e-4 ICM 26.27 17.64 35.30 0.31 chrotronradiationandthethermalemissionfromgalacticdust. GS-MRF,[1] 27.81 22.33 38.93 226.72 Except for CMB, these radiations have unknown emission ALM-MRF,[22] 27.91 20.88 36.41 2.86 spectra (that is, the coefficientsa in (1) are not all known). ALS-t 33.45 26.21 40.51 1.65 k,l Both observation models in (1) and (5) are suitable for our algorithm.Intheexperiments,weassumethemodelin(1),that isofinstantaneousmixturesandisotropicsky,to bevalid.We execution time of ALS-t is two orders of magnitude smaller plan to attack the problems of space variability and channel- than that of GS-MRF. The PSIR values of ALS-t are also dependent convolutional effects in the future. over those of LS, ALM-MRF, GS-MRF and ICM, especially Inthesequel,wepresentastrophysicalimageseparationre- for synchrotron, and furthermore, the smoothing degradation sultsona comparativebasis. Theproposedmethodisdenoted of ICM on the synchrotron component is not observed in the as ALS-t (Adaptive Langevin Sampler-t-distribution) and is proposed method , Fig. 3. compared to four other methods, namely: 1) GS-MRF, which is the MRF model coupled with Gibbs sampling [1]; 2) LS, Original LS (Initial) ALS−t which formsourinitialestimates on the basisof the valuesof 0.2 a obtained by FDCCA [31]; 3) Iterated ConditionalModes B k,l M 0 C (ICM), whichmaximizesthe conditionalpdfssequentiallyfor −0.2 eachvariable[11];4)ALM-MRF,whichisthesolutionofthe 29.39 dB 31.19 dB MRF model via Langevin and Metropolis-Hastings schemes [22T]h.e leftmost column of Fig. 3 shows the ground-truth Synchrotron 00..115 simulated astrophysical source maps. The remaining columns 0.05 40.08 dB 44.62 dB show the source maps separated by LS, ALS-t and GS-MRF, 20 respectively.Theskypatchusedforthisexperimentiscentered 15 at 0◦ longitude and 40◦ latitude in galactic coordinates and Dust 10 has a size of 7.3 7.3 square degrees in the celestial sphere, 5 × discretized in a 64 64 pixel map. 95.05 dB 96.83 dB × The Peak Signal-to-Interference Ratio (PSIR) is used as a numerical performanceindicator. The PSIR can be calculated Fig.5. Theseparatedastrophysicalimagesfromnoisyobservationswiththe if the ground-truth is known, which is the case in our work LS and ALS-t methods. The location of the patch is 20◦ longitude and 0◦ since all sky components are simulated. For this patch, the latitude, galactic plane, andhasasize128×128pixels. algorithm converges after 155 iterations and uses a total of 255 iterations to reach the solution (see Fig. 4). We compare TABLEV the results with the ones of LS, ICM, GS-MRF and ALM- THEPSIRIMPROVEMENTS(DB)WITHRESPECTTOINITIALLS MRF. SOLUTION. Table IV lists the PSIR values and the process times. The CMB Synchrotron Dust simulationsarerunonaCore2CPU1.86GHzPC.Theprocess (0◦,40◦) 3.01 10.01 4.08 time of ALS-t is much shorter than that of the GS-MRF. The (20◦,0◦) 1.80 4.54 1.78 10 IEEETRANSACTIONONIMAGEPROCESSING,VOL.?,NO.?,???????201? TABLEVI THEPSIRspec(DB)VALUESOFTHESEPARATEDCOMPONENTS,INTHE 12000 CMB Ground−truth 12000 CMB Ground−truth ANNULARFREQUENCYDOMAIN. 10000 LASLS−t 10000 LASLS−t 8000 8000 LS C30M.3B3 0S◦2y,.n46c05h◦. 3D7u.7st6 C31M.5B9 2S40y5◦n.,6c09h◦. 3D9u.3st5 π(l+1)lC/246000000 π(l+1)lC/246000000 ALS-t 35.87 26.23 40.69 34.53 46.98 38.93 2000 2000 00 200 400 600 80l0 1000 1200 1400 1600 00 200 400 600 80l0 1000 1200 1400 1600 (a) CMB,patch(0◦,40◦) (b) CMB,patch(20◦,0◦) Wehavealsorunthealgorithmfor128 128pixelspatches × centeredat(0◦,40◦)and(20◦,0◦).Fig.5showstheresultsfor Synchrotron Synchrotron thepatch(20◦,0◦).Inthatpatch,therelativeintensityofCMB 1089 GLASLroSu−ntdtruh 33050000 GLASLroSu−ntd−truth istheweakestone.ThePSIRvaluesoftheestimatesofLSand 7 2500 AarLeSl-isttaerdefworritttheantupnadtcehrtahnedmfaoprstahnedptahtechPS(0IR◦,i4m0p◦)roivnemTaebnltes π(l+1)lC/23456 π(l+1)lC/2112050000000 2 V. The total time of the ALS-t algorithm for the 128 128 1 500 size patches is about 5.31 minutes. × 00 200 400 600 80l0 1000 1200 1400 1600 00 200 400 600 80l0 1000 1200 1400 1600 We also use an alternative performance criterion, defined (c) Synchrotron, patch(0◦,40◦) (d) Synchrotron, patch(20◦,0◦) in the spherical harmonic (frequency), ℓ, domain, since the angular power spectrum is relevant to astrophysics. If we 12 Dust Ground−truth 3.5x 105 Dust Ground−truth decompose a CMB map on spherical harmonics, the com- 10 LASLS−t 3 LASLS−t 2.5 pthlee1xancogeuℓflfiarciepncotws,ecrc∗ℓsmp.e(cℓtru=m,0,C1(,ℓ2),,..a.s, mthe∈av[e−raℓg,eℓ])C, (dℓe)fin=e π(l+1)lC/2468 π(l+1)lC/21.125 2ℓ+1 m=−ℓ ℓm ℓm 2 0.5 In Fig. 6, we plot the standard power spectrum, C(ℓ), P 00 200 400 600 80l0 1000 1200 1400 1600 00 200 400 600 80l0 1000 1200 1400 1600 defined as C(ℓ) = (ℓ+1)ℓC(ℓ)/2π of the original and the (e) Dust,patch(0◦,40◦) (f) Dust,patch(20◦,0◦) reconstructed sources in the two patches considered. In order to compare different methods, we also introduce the Peak Fig. 6. Ground-truth and estimated angular power spectrums for patches Signal-to-InterferenceRatio in the ℓ-domain defined as (0◦,40◦) and (20,0). The ground-truth spectrum (solid line), spectrum of LSsolution(dotline)andspectrumofALS-tsolution(solidlinemarkedwith +). √N/2+1 max(C(ℓ)) PSIR =20log × (41) spec q  C(ℓ) C(ℓ) || − || ated through the Langevin stochastic equation. The proposed   where C(ℓ) is the estimated power spectbrum. algorithm provides two orders of magnitude computational In the off-galactic patch considered in Fig. 6, the intensity economy vis-a`-vis the Gibbs sampling approach. In addition, ofsyncbhrotronisverylowandtheLSsolutionforsynchrotron it generates better source separation as compared to all its is contaminated too much by noise. The estimated CMB and competitors,i.e.,LS,ICM,ALM-MRFandGS-MRFmethods thedustspectrumsbyALS-tfollowtheground-truthspectrum measured in terms of PSIR in the pixel domain and PSIR spec better than the LS one, especially in the high frequency in the annular frequency domain. The algorithm can recon- regions. For the patch (20◦,0◦), synchrotron and dust are structthe highfrequencyregionsofthepowerspectrumswith estimated adequately by LS, but the LS estimate of CMB is higher fidelity. A byproductof this approach is the capability improvedbyALS-t.TherelatedPSIR valuesarepresented to estimate the parameters of the t-distribution image priors. spec in Table VI. AlthoughtheproposedALS-tmethodtakeslongerthaneither We have observed that the estimated regression parameter ICMorLSmethods,itssuperiorperformancebyfaroutweighs α isquiteisotropicforallthemaps.FortheCMBmapinthe this disadvantage, and furthermore the algorithm lends itself l,d (0◦,40◦) patch, its value is about 0.88 for all d. In the same to parallelprocessing.To improvethe algorithmperformance, patch, the values of the parameter α for synchrotron and non-stationaryimage priors, more efficientdiscretization time l,d dust are 0.99. These results show us that the CMB radiation step and diffusion matrix can be investigated in the future. isspatiallylesscorrelatedthantheotherradiationsources.We Another point to obtain a beneficial algorithm might be the assume the parameter β is isotropic and estimate a single usage of more than one MH-steps, because the EM algorithm l,d value for each direction. The EM estimation of β depends which estimates parameters converges faster than the Monte l,d too much on its prior and initial value. We have allowed Carlo sampling scheme. the parameter δ to be anisotropic, but at the end of the Our new goal is the application of the proposed algorithm l,d estimation steps we have found that it is almost isotropic for to whole-sky maps. To avoid the difficulties inherent in this all radiation maps. problem, we plan to use the ”nested numbering” structure providedbytheHEALPix[37]package.Inthisformat,wecan VII. CONCLUSION ANDFUTURE WORK reach the indexes of the eight neighbors of each pixel on the We have developeda Bayesian source separation algorithm sphere. To calculate the pixel differences, we will implement forastrophysicalimageswheretheMCMCsamplesaregener- a gradient calculation method on the sphere by taking the

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.