ebook img

A Bayesian technique for the detection of point sources in CMB maps PDF

0.24 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview A Bayesian technique for the detection of point sources in CMB maps

Mon.Not.R.Astron.Soc.000,1–9(2010) Printed10January2011 (MNLATEXstylefilev2.2) A Bayesian technique for the detection of point sources in CMB maps 1 F. Argu¨eso1⋆, E. Salerno2, D. Herranz3, J. L. Sanz3, E. E. Kuruog˘lu2 and K. Kayabol2 1 0 1 Departamento de Matem´aticas, Universidad de Oviedo, 33007, Oviedo, Spain 2 2 CNR Istituto di Scienza e Tecnologie dell’Informazione, viaG. Moruzzi 1, I-56124, Pisa, Italy n 3 Institutode F´ısica de Cantabria, CSIC-UC, Av. los Castros s/n, Santander, 39005, Spain a J 7 Received–,Accepted – ] O ABSTRACT C The detection and flux estimation of point sources in cosmic microwave background h. (CMB) maps is a very important task in order to clean the maps and also to obtain p relevant astrophysical information. In this paper we propose a maximum a posteriori - (MAP)approachdetectionmethod ina Bayesianschemewhichincorporatespriorin- o formationaboutthe sourceflux distribution,thelocationsandthe numberofsources. r t We apply this method to CMB simulations with the characteristics of the Planck s satellite channels at 30,44, 70 and 100 GHz. With a similar level of spurious sources, a [ ourmethodyieldsmorecompletecataloguesthanthematchedfilterwitha5σ thresh- old. Besides, the new technique allows us to fix the number of detected sources in a 1 non-arbitrary way. v 6 Key words: methods: data analysis – techniques: image processing – radio contin- 5 uum: galaxies – cosmic microwave background 4 1 . 1 0 1 INTRODUCTION tronomy, with a more complete list of references, can be 1 found in Herranz& Vielva(2010). 1 The detection and estimation of the intensity of compact v: objects embedded in a background plus instrumental noise When a MF or a wavelet is applied to a CMB map in i is a problem of interest in many different areas of sci- the blind detection case, i.e. when it is assumed that the X ence and engineering. A classic example is the detection number of point sources, their positions and fluxes are un- r of point-like extragalactic objects –i.e. galaxies– in sub- known,themost common method for detection is based on a millimetric Astronomy. Regarding this particular field of the well-known idea of thresholding: the maxima of the fil- interest, different techniques have proven useful in the lit- tered map above a given threshold are selected and consid- erature. Some of the existing techniques are: the standard ered as the positions of the sources, so that the number of matched filter (MF, Nailong 1992), the matched multifil- detectedsourcesisthenumberofmaximaabovethatthresh- ter (Herranz et al. 2002; Lanz et al. 2010) or the recently old.Thefluxesareestimated thenbyusingthecorrespond- developed matched matrix filters (Herranz& Sanz 2008). ing estimation formulas with the MF or the wavelet. The Other methods include continuous wavelets like the stan- valueofthisthresholdremainsarbitrary,thougha5σ cutis dard Mexican Hat (Sanzet al. 2006) and other members oftenapplied,sinceitguaranteesthatunderreasonablecon- of its family (Gonz´alez-Nuevo et al. 2006). All these filters ditions a few detected sources are spurious. Apart from the have been applied to real data of the Cosmic Microwave arbitrariness of this procedure, theprior knowledge regard- Background (CMB), like those obtained by the WMAP ingtheaveragenumberofsourcesinthesurveyedpatch,the satellite (L´opez-Caniego et al. 2007) and CMB simulated fluxdistributionofthesesourcesorotherpropertiesarenot data (Leach et al. 2008) for the experiment on board the used,so that useful information is being neglected. Planck satellite (Tauber 2005). Besides, Bayesian methods Bayesiandetectiontechniquesprovideanaturalwayto have also been recently developed (Hobson & McLachlan take into account all the available information about the 2003;Carvalho et al.2009).Amoredetailedreviewonpoint statistical distribution of both the sources and the noise. source detection techniques in microwave and sub-mm As- Unfortunately, up to this date only a few works have ad- dressedtheproblemofdetectingextragalacticpointsources in CMB data (Hobson & McLachlan 2003; Carvalho et al. 2009). The reason for this is twofold: on the one hand, the ⋆ E-mail:[email protected] statisticalpropertiesofextragalacticsourcesatsub-mmfre- 2 Argu¨eso et al. quencies are still very poorly known. On the other hand, iscomputationalandalgorithmiccomplexity.Dependingon mappingthefullposteriorprobabilitydensityofthesources thechoiceofpriorsandthelikelihoodfunction,thefullpos- isoften verydifficultandcomputationally expensive.These terior distribution of the parameters of the sources may be two problems explain, at least partially, the predominance very complex and in most cases it is impossible to obtain of frequentist over Bayesian methods in the literature. Let maximumaposteriori (MAP)valuesoftheparameters and usconsider theprevious two problems separately: their associated errors via analytical equations. Numerical The microwaveand sub-mmregion hasbeen untilvery sampling techniques such as Monte Carlo Markov Chain recently oneof thelast unchartedareas in astronomy.Con- (MCMC) methods are required in order to solve the infer- cerningextragalactic sources, thisregion of theelectromag- ence problem, but these methods are computationally in- netic spectrum is where the total number of counts passes tensive. It is thus necessary to apply computing techniques from beingdominated byradio-loud galaxies tobeingdom- specifically tailored for accelerating the convergence and inated bydustygalaxies. Althoughaminimum oftheemis- improving the efficiency of the sampling (Feroz & Hobson sion coming from extragalactic sources isexpected tooccur 2008) and/or to find smart approximations of the poste- around 100–300 GHz, they are still considered as the main riornearitslocal maxima(Carvalho et al.2009).Butthese contaminant of the CMB at small angular scales at these enhancements have the cost of increasing dramatically the frequencies(Toffolatti et al.1998;de Zotti et al.2005).The algorithmic complexity of the detection software, introduc- uncertainties about thenumbercounts at intermediate and ing new layers of intricacy in the form not only of addi- lowflux,redshiftdistribution,evolutionandclusteringprop- tional assumptions and routines, but also of regularization ertiesof thismixed population ofobjects arelarge. Inmost ’constants’,hiddenvariables,hyperparametersandselection cases this has motivated the use of noninformative priors, thresholds that in many cases must be fine-tunedmanually which avoid to make adventurous assumptions about the in order to be adapted to the specific circumstances of a sourcesbutontheotherhandmisspartofthepowerofthe givendataset.Thecomplexityofthealgorithmscanriseto priorsthatarebasedonobservationsandphysicalintuition. almostbaroquelevels,havinganegativeeffectontheporta- But,inspiteofwhathasbeensaidabove,ourknowledge bility of thecodes and on thereproducibility of theresults. about the statistical properties of point sources is growing We propose in this paper a simple strategy based on day by day thanks to the new generation of surveys and Bayesian methodology which incorporates sensible prior in- experiments. In the high-frequency radio regime, WMAP formation about the source locations, the source fluxes and observations are in agreement with the de Zotti model the source number distribution. With these priors and as- (deZotti et al.2005;Gonz´alez-Nuevo et al.2008).Priorsfor sumingaGaussianlikelihood,wecanobtainanexplicitform thenumberdensityandfluxdistributionsintherangeoffre- of the negative log-posterior of the number of sources and quencies>5GHzaremoreandmorereliablethankstothe their fluxes and positions. Assuming a MAP methodology, information provided by recent surveys such as CRATES we introduce a straightforward top-to-bottom detection al- at 8.4 GHz (Healey et al. 2007), the Ryle-Telescope 9C at gorithmthatallowsustodeterminethenumber,fluxesand 15.2 GHz (Taylor et al. 2001; Waldram et al. 2003) or the positionsofthesources.Wegiveasimpleproofthatthepo- AT20Gsurveyat20GHz(Ricciet al.2004;Massardi et al. sitionsofthesourcesmust belocatedinthelocalmaximaof 2008; Mahony et al. 2010). For a recent review on radio thematched-filtered image if thereis not asignificant over- andmillimiter surveysand theirastrophysical implications, lap between sources. The main computational requirement see de Zotti et al. (2010). The situation is worse in the far- of our algorithm is the solution of a system of non-linear infrared part of the spectrum, where relatively large uncer- equations. Our method differs from the one presented by tainties remain in the statistical properties, the evolution Carvalho et al. (2009) in fivemain points: and, above all, the clustering properties of dusty galaxies. (i) We use a more realistic set of priors for the source Most of the existing dusty galaxy surveys have been car- number, intensity and location distributions. In particular, ried out in the near and medium infrared with IRAS, ISO our choice of the prior on the locations is also flat but de- and Spitzer, but the wave-band from 60 to 500 µm is still pendson thenumberof sources n, which later proves tobe virtually terra incognita. The only survey of a large area of decisive for the log-posterior. the extragalactic sky at a wavelength above 200 µm is the (ii) We obtain an explicit form of the negative log- one recently carried out by the Herschel pathfinder exper- posterior and an explicit solution of the MAP estimate of iment, the Balloon Large Area Survey Telescope (BLAST, thesource intensities as the solution of a non-linear system Devlin et al.2009).Inthenextfewmonths,however,thelu- of equations. minosityfunctionandthedust-massfunctionofdustygalax- (iii) We prove that, for non-overlapping sources and a ies in the nearby Universe will be much better understood Gaussian likelihood, the MAP estimation of the positions thanks to the Herschel-ATLAS Survey (Eales et al. 2010), of the sources is given by the location of the local maxima whichcoversthewavelengthrangebetween110and500µm of thematched filtered images. andhasalreadyproducedinterestingresultsduringtheHer- (iv) Since we are interested only in point sources, we fix schel Science Demonstration Phase (Clements et al. 2010). thesize parameter of the objects to be detected. Thankstotheseandthepreviouslymentionedobservations, (v) WecanalsofindtheMAPsolutionforthenumberof the sub-mm gap is narrowing and our knowledge of galaxy sources present in the images with a simple top-to-bottom populations in this wave band, albeit far from perfect, is search strategy. We do not need to resort to costly evalua- quicklyimproving. tions of the Bayesian evidence. Apart from the uncertainties on the priors, the other complication that has traditionally deterred microwave as- The layout of the paper is as follows: in section 2 we tronomersfromattemptingBayesianpointsourcedetection present the method and derive the corresponding posterior A Bayesian technique for the detection of point sources in CMB maps 3 which includes the data likelihood and the priors. In sec- the n×2 matrix R, containing all their coordinates. If we tion 3 we apply the method to CMB simulations with the wanttoadoptaBayesian strategytosolveourproblem,we characteristics oftheradio Planckchannels(from 30to100 must be able to write the posterior probability density of GHz,wherethenumbercountpriorsare most reliable) and ourunknowns.Asuitableestimation criterionmustthenbe compareitwiththestandardprocedureofusingaMFwith chosen. a 5σ threshold. The main results are also presented in sec- tion 3. The conclusions are given in the finalsection. 2.1 Posterior By the Bayes rule, the posterior we are looking for has the 2 METHODOLOGY following form In a region of the celestial sphere, we suppose to have an p(n,R,a|d)∝p(d|n,R,a)p(n,R,a) (5) unknown number n of radio sources that can be consid- where p(d|n,R,a) is the likelihood function, derived from ered as point-like objects if compared to the angular reso- our data model (4). To find the prior density p(n,R,a) we lution of ourinstruments.Thismeansthat theiractual size needtomakeanumberofassumptions.Let usfirstobserve is smaller than our smallest resolution cell. The emission of that, in principle, both R and a depend on n, through the thesesources is superimposed to aradiation f(x,y) coming numberof theirelements. Ontheotherhand,wecan safely from diffuseorextendedsources. Inourparticularcasethis assume that, once n is fixed, thefluxes a of the sources are radiation is the CMB plus foreground radiation. A model independent of their locations. These assumptions lead us for the emission as a function of the position (x,y) is to write n d˜(x,y)=f(x,y)+ aαδ(x−xα,y−yα), (1) p(n,R,a)=p(R,a|n)p(n)=p(R|n)p(a|n)p(n) (6) Xα=1 Thisexpressionisvalidwhenweconsiderextragalacticpoint sources,whosefluxesarenotrelatedtotheirpositions.This where δ(x,y) is the 2D Dirac delta function, the pairs will bethe case in this paper. (xα,yα) are the locations of the point sources in our region of the celestial sphere, and aα are their fluxes. We observe this radiation through an instrument, with beam pattern 2.2 Likelihood function b(x,y),andasensorthataddsarandomnoisen(x,y)tothe signal measured. Again, as a function of the position, the Asmentionedabove,thelikelihoodfunctionderivesfromthe output of our instrument is: physics associated to the assumed data model. In general, unfortunately,adatamodeloftype(2)isdifficulttodescribe n statistically.Wearegoingtoassumefromnowonthatǫisa d(x,y)= aαb(x−xα,y−yα)+(f∗b)(x,y)+n(x,y), (2) randomGaussianfieldwithzeromeanandknowncovariance Xα=1 matrix ξ. This is true if we only consider the CMB and the wherethepointsourcesandthediffuseradiation havebeen instrumentalnoise, excludingotherforegrounds. Inthispa- convolved with the beam. In our application, we are inter- perwewill dealwith zonesoftheskywheretheforeground ested in extracting thelocations and thefluxesof thepoint contributionisnotimportantorwheretheforegroundshave sources.Wethusassumethatthefluxesofthepointsources been conveniently removed by component separation tech- are sufficiently above the level of the rest of the signal plus niques.The likelihood is thus the noise, and consider the latter as just a disturbance su- perimposed to the useful signal. If ǫ(x,y) is the sum of the (d−φa)tξ−1(d−φa) p(d|n,R,a)∝exp − . (7) diffusesignal plus thenoise, model (2) becomes (cid:20) 2 (cid:21) n Observe that the negative of the exponent in (7) is in any d(x,y)= aαb(x−xα,y−yα)+ǫ(x,y). (3) case the squared ξ−1-norm fit of the reconstructed data to Xα=1 themeasurements,andthisalwayscarriesinformationabout If our data set is a discrete map of N pixels, the above the goodness of our estimate. However, if the Gaussian as- equation can easily be rewritten in vector form, by letting sumption is not verified, function (7) is not the likelihood d be the lexicographically ordered version of the discrete of our parameters, and when it is introduced in (5), we do mapd(x,y),abethen-vectorcontainingthepositivesource not obtain the posterior distribution we are looking for. In fluxesaα,ǫthelexicographically orderedversion ofthedis- oursimulations wewill also includetheconfusion noise due cretemap,ǫ(x,y),andφbeanN×nmatrixwhosecolumns to faint extragalactic sources. This confusion noise is not are the lexicographically ordered versions of n replicas of Gaussian but as we will see later, it does not hamper the themap b(x,y),each shifted ononeofthesourcelocations. detections, since its standard deviation is much lower than Equation (3) thusbecomes that of the CMB plus the instrumental noise for the fre- quenciesconsideredinthispaper.Forthesakeofsimplicity, d=φa+ǫ. (4) we will defer to further papers the treatment of the more Looking at equations (3) and (4), we see that, if the general case which includes theforegrounds. goal is to find locations and fluxesof thepoint sources, our unknowns are the number n, the list of locations (xα,yα), 2.3 Prior on source locations with α = 1,...,n and the vector a. It is apparent that, once n and (xα,yα) are known, matrix φ is perfectly de- A priori, it is reasonable to assume that all the different termined. Let usthen denote thelist of source locations by combinations of n distinct locations occur with the same 4 Argu¨eso et al. probability.Thenfunctionp(R|n) inEq.(6)canbewritten whereλ,theintensityofthePoissonvariable,istheexpected as number of sources in the map at hand. The value of λ will n!(N −n)! depend on the flux detection limit am, the size of the map p(R|n)= , (8) and thewavelength of theobservation. N! sinceN!/(n!(N−n)!)isthenumberofpossibledistinctlists ofnlocationsinadiscreteN-pixelmap.Thisassumptionis 2.6 An explicit expression for the negative based on the fact that the sources considered in this paper log-posterior are spatially uncorrelated. If we multiply all the factors which appear in (5) and (6) andcalculatethenegativelog-posterior,wefind(apartfrom additiveconstants) 2.4 Prior on fluxes 1 L(n,R,x) = xtMx−2etx −log(N−n)!−nlog(λ) Experimentally, it has been found that the fluxes of the 2 (cid:0) (cid:1) strongestsourcesareroughlydistributedasanegativepower 1 γ−1 1 law, with exponent γ. Conversely, the weak sources have − nlog(p)+nlogB(cid:18)1+xpm; p ,p(cid:19) fluxes that are roughly uniformly distributed. To include n these two behaviors into a single formula, one should first + γ log(1+xp), (12) discriminateinsomewaybetweenweakandstrongsources. p α Xα=1 This can be done empirically, by establishing a sort of threshold a0 on the fluxes and a conditional prior with the with M = a20φtξ−1φ and e = a0φtξ−1d. The correlation form of theGeneralized Cauchy Distribution (Rider1957): matrix ξ is computed by using the Cℓ’s obtained from the WMAP five-year maps (Nolta et al. 2009) and adding the p(a|n)∝ n 1+ aα p −γp , (9) instrumental noise. We assume that we know a0, p, xi, γ αY=1h (cid:16)a0(cid:17) i andλ,infactwecalculatethembyusingthedeZotticounts model (deZotti et al. 2005). Therefore, the unknowns are: with p a positive number. Distribution (9) obviously as- thenormalized fluxesx, thenumberof point sources n and sumes that the fluxes of the different sources are mutually thepositionsofthepointsourcesthroughthematrixφ(R). independent.Thispriorclearly showsthebehaviorrequired Letusnowexaminethestructureoffunction(12).The for strong and weak sources. In order to work with non- firsttermcomesfromthelikelihood,andobviouslydecreases dimensional quantities, we define xα = aα/a0; we also as- asmuchasoursolutionfitsthedata.Thesecondtermtakes sume that we will detect point sources above a minimum into account the prior for the source configuration and pe- fluxam, that leads to thefollowing normalized distribution nalizes large values of n. The third term comes from the prioronthenumberofsourcesand,dependingonthevalue n p(x|n)= p (1+xpα)−γp , xαǫ[xm,∞) of λ, favors (λ>1) or disfavors (λ<1) the increase of the αY=1B(cid:16)1+1xpm;γ−p1,p1(cid:17) nonumthbeersooufrscoeuflrucexse.sTcohnedfoitliloonweidngtotenr,mthsecolamsteofrnoemintthroedpurcioesr (10) an additional cost as soon as a new source is added to the where B is the incomplete beta function and xm =am/a0. solution. If N ≫ λ and N ≫ n, what is typical in CMB Inthenextsection, we determinethevalues ofa0,pand γ, maps, it can be proven by using Stirling’s approximation byfittingthisformulatothepointsourcedistributiongiven that the second term is the dominant one coming from the by thedeZotti counts model (deZotti et al. 2005). priors.Inthenextsection,wewill analyzewith simulations thecontribution of each particular term. 2.5 Prior on the number of sources 2.7 Maximum a posteriori (MAP) solution We need to establish a discrete probability distribution that expresses the probability of a number of occurrences Formula (12) includes all the information about the po- in a fixed domain once their average density is known, sitions, fluxes and number of sources. In order to obtain their locations in the domain are mutually independent, concrete results, we will choose the values of R, x and n and no pair of sources can occur in the same location. which maximize the posterior. This choice will be justified All these assumptions seem reasonable when applied to bymeansofthesimulationsandresultsthatwewillpresent the configurations of the point sources in the celestial in the nextsection. sphere, at least for radio-frequencies (Argu¨eso et al. 2003; Therefore,regardingthefluxweminimize(12)withre- Gonz´alez-Nuevo et al. 2005). Assuming a continuous map spect to x, by taking the derivative and equating to zero domain,all therequirementsmentioned aresatisfied bythe and we obtain Poisson distribution. Strictly speaking, we have a discrete n γxp−1 N-pixelmap,soabinomialdistributionshouldbeused,but Mαβxβ−eα+ 1+αxp =0. (13) if N is not too small, a Poisson distribution should model Xβ=1 α correctly the probability of having n occurrences of point Bysolving(13)numericallywewouldobtaintheestimatorof sources. The prior on n appearing in (6) is thus xwhichyieldsthemaximumposteriorprobability.However, λne−λ weknowneitherthenumberofsourcesnortheirpositions.In p(n)= , (11) order to determine the positions, we assume that the point n! A Bayesian technique for the detection of point sources in CMB maps 5 sources arein thelocal maxima ofe,whichis thematched- 30 GHz filtered map of the original data. In the following, we will 4 show that this assumption can be safely adopted, since the De Zotti counts minimaofLmustbeinthemaximaofthematched-filtered 3 Fit map (we also remark that the matched filter is not intro- duced ad hoc, but it appears naturally as a part of the for- 2 malism). ds) For simplicity, let us assume that we have only one n/ source, in this case the terms of L which depend on the 10(d 1 g o fluxcan be written l 0 L(x)= M11x2 −e1x+ γ log(1+xp), (14) 2 p −1 whereM11 and e1 arethecorrespondingvaluesof Mand e −2 at the pixel supposedly occupied by the point source. If we −1 −0.5 0 0.5 1 log10(Flux Jy) take the derivative of (14) with respect to x and equate to zero, we obtain the following equation for xˆ, the estimator Figure 1. log10 of the differential counts plotted against the of x: fluxforthedeZottimodel(greenline)andthefittotheextended power-lawgivenby(9)withtheparametersofTable1(blueline). γxp−1 M11x+ 1+xp =e1⇒xˆ=xˆ(e1). (15) Thetwolinesarenearlyindistinguishable. Ifwesubstitutethelastexpressionineq.(14),wecanwrite 3 SIMULATIONS AND RESULTS theexpression for thenegative log-posterior L(xˆ(e1)) 3.1 The simulations L(xˆ(e1))=−M11xˆ2 − γxˆp + γ log(1+xˆp). (16) Inordertochecktheperformance ofthenewtechnique,we 2 1+xˆp p have carried out simulations including CMB, instrumental noiseandpointsources.Thesimulationshavethecharacter- By taking the derivative of this formula with respect to e1, istics of the 30, 44, 70 and 100 GHz channels of the Planck we finally find satellite: pixel size, beam width and instrumental noise1. The simulations are flat patches of 32×32 pixels (30 and dL = dL de1 −1=−xˆ(e1). (17) 44 GHz) and 64×64 pixels (70 and 100 GHz), so that the de1 dxˆ (cid:16)dxˆ(cid:17) size of each patch is 3.66×3.66 square degrees. In order to avoid border effects, we simulate patches of four times this where we have calculated the derivative in (15). Since the size and keep the central part for our analysis. The small estimated value of the flux must be positive, this expres- size of the simulations allows us to do our calculations in a sion shows that the negative log-posterior at the estimated fast way. Weperform 1000 simulations for each channel. valueofxdecreaseswithe1,soitisminimumatthehighest TheCMBmapshavebeengeneratedbyusingthepower valueofe1i.e.atthemaximumofthematched-filteredmap. spectrum,theC ’s,thatproducesthebestfittotheWMAP Therefore, the posterior, calculated at the estimated flux ℓ five-year maps (Nolta et al. 2009), we have also added the value,ismaximumwhenweassumethatthepointsourceis instrumental noise of the 30, 44, 70 and 100 GHz chan- atapeakofthemap.Thisconclusionisvalidifwehavemore nelsofthePlancksatellite.Finally,wehavesimulatedpoint than one source, provided that there is no overlap between sources, by taking into account the flux distribution pre- sources,i.e. theareas wheretheindividualimages ofallthe dictedbythedeZottimodel(deZotti et al.2005).Wehave sourcesarenonzeromustbecompletelydisjoint, becausein includedthefainttailofthedeZottidistribution,simulating this case each source can be treated individually. point sources from 0.01 mJy on. In this way, we have con- In order to determine the number of sources, we sort sideredtheconfusion noiseduetounresolvedpointsources. these local peaks from top to bottom and solve (13) suc- Thestandarddeviationofthisconfusionnoiseismuchlower cessively adding a new source. At the same time, we calcu- thanthatoftheCMBplusinstrumentalnoiseinthesechan- late the negative log-posterior (12) and choose the number nels. n of sources which produce the minimum value of (12). In For each simulation we consider the negative log- this way we have constructed an objective stopping crite- posteriorgivenby(12).Inthisequationweseeseveralmag- rion which yields, by combining (12) and (13), the number ofsourcesandtheirfluxeswhichmaximizetheposterior.In nitudes, γ, a0, am and λ, which depend on the frequency. By fitting the number counts given in the de Zotti model the next section, we apply the method explained above to by (9), we calculate γ and a0. We have taken p=1 in (9), the detection and flux estimation of point sources in CMB since the goodness-of-fit obtained by changing p is not bet- maps. ter than that of the particular case p = 1. The parameter In order to compare this technique with a standard method, we also calculate the local peaks above a certain threshold, for instance a 5σ threshold, and solve (13) with 1 For details on the Planck instrumental and γ =0, that amounts to using a MF, i.e. a maximum likeli- scientific performance, see the Planck web site hood estimator. http://www.rssd.esa.int/index.php?project=PLANCK. 6 Argu¨eso et al. frequency γ a0 λ 30 GHz 30 GHz 2.90 0.19 0.69 −6078 44 GHz 2.87 0.15 0.53 −6080 all priors 70 GHz 2.87 0.15 0.49 no flux prior, no Poisson no flux prior 100 GHz 2.87 0.15 0.47 −6082 no Poisson Table1. Valuesofthepower-lawexponentγ andthefluxa0 in −6084 Jy, as obtained by fitting the De Zotti counts. λ is the average −6086 numberofpointsourcesabove0.25Jyintheconsideredpatches. L −6088 λ is theaverage numberof sources per patch and am is the −6090 minimumfluxthatweconsiderinourdetectionscheme,we −6092 have chosen am =0.25 Jy, the typical rms deviation of the CMB plus noise maps at the frequencies we consider. The −6094 values of γ, a0 and λ are shown in Table 1 for the different −6096 frequencies. In Figure 1 we show as an example the fit to 0 1 2 3 4 5 number of sources ourextendedpower-law(9)inthecaseofthe30GHzchan- nel.Itisclearthattheextendedpower-lawfitsverywellthe Figure2. Negativelog-posterioragainstthenumberofdetected counts predicted by the de Zotti model. The value of χ2 is sources for a simulation at 30 GHz. We have included in the (2−3)×10−3 giving probabilities very close to 1. posterior: all the priors (blue solid line), all the priors but the sourcefluxdistribution(reddash-dotted line),allthe priorsbut the Poisson source number distribution (cyan dashed line) and 3.2 Discussion on the performance of the finally we have excluded both the Poisson and the source flux algorithm distribution(greendotted line). For each simulation we calculate M = a2φtξ−1φ and e = 0 a0φtξ−1d.Weobtainξ−1fromtheWMAPCℓ’s,takinginto 30 GHz account the effects of the pixel and the beam windows and 0.9 the corresponding noise levels for each channel. We calcu- 0.8 late the estimated fluxes xˆα by solving (13) as explained in the previous section: we select the maxima of e above 0.7 a certain threshold (we choose a 1σ threshold so that we have a suitable number of peaks) and we perform a top to bility0.6 a bottom strategy, i.e. we sort the local maxima downwards b0.5 o fpreoamkwhieghsoelrveto(1l3o)wienrclvuadluinegs iannedacshtanrteiwngitferroamtiotnheanheigwhelost- osterior pr0.4 calmaximum.Atthesametime,wecalculate(12)andstop p0.3 the iterations when we find the minimum value of the neg- 0.2 ative log-posterior. In this way, we obtain the source fluxes andthenumberof sourcesthat maximize thelog-posterior. 0.1 We also calculate the local peaks of e above a 5σ thresh- 0 old, a standard detection method, and calculate the source 0 1 2 3 4 number of sources flux by solving (13) with γ = 0, this is equivalent to us- ing a MF with a 5σ threshold. Our intention is to compare Figure3. Posteriorprobabilityagainstnumberofdetectedpoint the Bayesian method (BM), with prior information and a sources for a simulation at the 30 GHz channel with one real natural stopping criterion, and thestandard MF. source. According to our simulations, the fundamental contri- butions to the posterior come from the likelihood (7) and theprioron sourcelocations (8).Theother termsalso con- answer can be seen in Figure 3, where we show, as an ex- tribute, but as can be seen in Figure 2, where we show the ample, the normalized posterior probability plotted against negativelog-posteriorplottedagainstthenumberofsources the number n of point sources for a given simulation with for a particular simulation, their influence is not so impor- onerealsource.Theprobabilityisclearlypeakedattheesti- tant.Thelikelihoodtendstoincreasethenumberofdetected matednumberofsources, whichistherealnumberofpoint sources, over-fittingthe data and the prior (8) tends to de- sources in this case. In our simulations we observe that the crease the number of sources. The combination of (7) and posterior probability is always strongly peaked around the (8)fixesthemostprobablenumberofpointsources,though estimated numberof sources. theothertwoterms,althoughlessimportantcanhavesome Weanalyze 1000 simulations foreach oftheconsidered influence.Thisshowstherobustnessofthemethodwithre- Planck channels: we calculate the contamination (thenum- spect to small changes in the parameters of priors (10) and ber of detected spurious sources over the total number of (11). detected sources above a given flux), thecompleteness (the Wealso raisethequestion whethertheestimated num- number of real detected sources over the number of simu- berofpointsourcesgivesusaclearlyhigherposteriorprob- lated sourcesaboveagivenflux)andtheaverageoftheab- ability, i.e ∝ exp(−logL) than other close numbers. The solute value of the relative error of the estimated flux with A Bayesian technique for the detection of point sources in CMB maps 7 100 GHz 100 GHz 0.08 0.07 BM 1 0.06 MF 0.8 ation0.05 ness n e ami0.04 plet0.6 BM ont om MF C0.03 C 0.4 0.02 0.2 0.01 0 0 0 0.5 1 1.5 2 2.5 3 0 0.5 1 1.5 2 2.5 Flux (Jy) Flux (Jy) Figure 4. Contamination plotted against the flux for the BM Figure5. CompletenessplottedagainstthefluxfortheBMand andtheMF(100GHz). theMF(100GHz). respect to the real flux (reconstruction error). We count a 100 GHz detectedsourceasrealwhenthereisarealsimulatedsource 1 at a distance no longer than two pixels from the detected one,thisdistanceisthepositionerror.Thisrealsourcemust 0.9 have a flux equal or higher than 0.20 Jy, to fix a threshold close to the1σ level of theCMB plusnoise map. The same 0.8 conditions are required for theMF. Inordertogivetheuncertaintyintheflux,derivedfrom ness0.7 e ourBayesian approach,we will obtain a 95% confidencein- plet terval associated to our probability distribution Com0.6 BM MF P(x)∝exp(−L(x)), (18) 0.5 where L(x) is given by eq. 14. This will be called the esti- 0.4 mationerror.Wecanalsocalculatetheexpectationvalueof the flux from this distribution, this value can be compared with themost probable value (i.e. our estimated flux). 0 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 Contamination Figure 6. Completeness plotted against contamination for the 3.3 Results BMandtheMF(100GHz). Taking into account the considerations above, we have ap- plied our algorithm to the simulations described in 3.1 and obtained thefollowing results. (BM) and the completeness is 100%. For the MF there are At the 30 GHz channel we have 4.9% contamination no spurious sources from 0.30 Jy on, but the completeness above0.2JywiththeBM,and0.7%withtheMF.However, at 0.8 Jy is only 70%. As in the 30 GHz case, we obtain the completeness is much better for the BM, 64%, than for similar average values of the absolute value of the relative the MF, 22%. From 0.7 Jy on we do not have any spurious errorforbothmethods.Forinstance,weobtainerrorsbelow source(BM)andthecompletenessis99%.FortheMFthere 15% from 0.90Jyonin bothcases. 15% ofthesourceshave are no spurious sources from 0.25 Jy on, but the complete- a reconstruction error on the position of 1 pixel and only ness at 0.7 Jy is only 75%. In regard to the average value 3% have a higher error. These results are nearly the same of the absolute value of the relative error (reconstruction for theBM and theMF. error), when we calculate this error in flux intervals of 0.1 Atthe70GHzchannelwehave3.2%ofspurioussources Jy, we obtain similar values for the BM and the MF. For forfluxeshigherthan0.2Jywith theBM,and1%with the instance, we obtain errors below 15% from 0.6 Jy on and MF. The completeness is 45% for the BM and 19% for the below10%from1Jyonforbothmethods.Only14%ofthe MF. From 0.45 Jy on we do not have any spurious source sourceshaveareconstructionerroronthepositionof1pixel (BM) and the completeness is 96%. For the MF there are and only 3% have a higher error. These results are nearly no spurious sources from 0.30 Jy on, but the completeness thesame for theBM and theMF. at 0.45 Jy is only 46%. As in the cases above, we obtain Atthe44GHzchannelwehave6.5%contaminationfor similar average values of the absolute value of the relative fluxes higher than 0.2 Jy with the BM, and 4% with the errorforbothmethods.Forinstance,weobtainerrorsbelow MF. The completeness is 37% for the BM and 11% for the 15% from 0.60 Jy on in both cases. 9% of the sources have MF. From 0.8 Jy on we do not have any spurious source a reconstruction error on the position of 1 pixel and only 8 Argu¨eso et al. 100 GHz 100 GHz 0.25 5 4.5 0.2 BM 4 MF 3.5 0.15 ux (Jy) 3 |error| ated Fl2.5 0.1 Estim 2 1.5 0.05 1 0.5 0 0 0.5 1 1.5 2 0 1 2 3 4 5 Flux (Jy) Real Flux (Jy) Figure7. Averagevalueoftheabsolutevalueoftherelativeerror Figure8. EstimatedfluxagainstrealfluxfortheBM(100GHz). plotted against the flux forthe BMand the MF(100 GHz). We Wehaveplottedthestraightliney=xforcomparison. canseeintheplotthelowvaluesoftheerrorforbothmethods. ratesthreepriordistributions:auniformdistribution(8)on 1%haveahighererror.TheseresultsaresimilarfortheBM the source locations, an extended power law on the source and the MF. fluxes(10)andaPoissondistributiononthenumberofpoint At the 100 GHz channel we have 4.7% of spurious sourcesperpatch(11).TogetherwithaGaussianlikelihood, sourcesforfluxeshigherthan0.2JywiththeBM,and7.5% thesepriors producethe negative log-posterior (12). withtheMF.Thecompletenessis89%fortheBMand32% We minimize this negative log-posterior with respect for the MF. From 0.3 Jy on we do not have any spurious tothesource fluxesin orderto estimate them.At thesame source (BM) and the completeness is 100%. For the MF time,weshowthatthedetectedsourcesmustbeinthepeaks therearenospurioussources from 0.75 Jy on and thecom- of the matched-filtered maps. Finally, we choose the num- pletenessat0.75Jyis94%.Asinthecasesabove,weobtain berofpoint sourceswhich minimizes(12)fortheestimated similar average values of the absolute value of the relative fluxes. error for both methods. For instance, we obtain errors be- Inthisway,wegiveanon-arbitrarymethodtoselectthe low10%from0.5Jyinbothcases.1%ofthesourceshavea number of point sources. Finally, to check the performance reconstructionerrorontheposition of1pixelandthereare ofthistechnique,wecarryoutflatCMBsimulationsforthe nosources with a higher error. These results are similar for Planckchannelsfrom30to100GHz.Forsimplicity,wehave theBM and theMF. excluded theforegrounds in our simulations, assuming that In order to visualize these results, we have plotted for weareconsideringzonesoftheskywhichhavebeencleaned the 100 GHz channel the contamination (integrated con- bytheapplication of component separation methods. How- tamination) against the flux in Figure 4, the completeness ever,wehaveincludedtheconfusionnoiseduetounresolved (integrated completeness) against the flux in Figure 5, the point sources in our simulations. completenessagainstthecontaminationinFigure6,theav- WecompareourBayesianstrategywiththeapplication erage value of the absolute value of the relative error in ofamatched filterwith astandard 5σ threshold.Wecalcu- Figure7andfinally,theestimatedfluxagainst therealflux late the contamination, the completeness and the relative in Figure 8. Itis clear that for a given valueof thecontam- error for both methods. Though the percentage of spurious ination the completeness is higher for the BM than for the sourcesisalittlehigherfortheBMatlowfluxes≃0.2−0.3 MF. Although the results are similar at all the studied fre- Jy, the completeness is much better, allowing us to obtain quencies, we havechosen the100 GHzchannelin order not catalogueswitha99%completenessandnospurioussources to complicate unnecessarily thefigures. from 0.7 Jy (30 GHz), 0.8 Jy (44 GHz), 0.55 Jy (70 GHz) In Figure 9 we plot the expectation value of the flux and 0.3 Jy (100 GHz) on. The reconstruction errors in the against the estimated flux for the 100 GHz channel. The estimated fluxesare similarly low for both methods. expectation value is nearly the same as the most probable value. We also plot the 95% confidence intervals. In this way, we have an idea of the uncertainty of our estimates, thisconfidenceintervalis≃0.20 Jy(estimation error).The ACKNOWLEDGMENTS results at other frequencies are similar. The authors acknowledge partial financial support from the Spanish Ministerio de Ciencia e Innovaci´on project AYA2007-68058-C03-02 and from the joint CNR-CSIC re- 4 CONCLUSIONS searchproject2008-IT-0059.KKwassupportedbytheItal- In thispaper we propose a newstrategy based on Bayesian ian Space Agency, ASI, under the program on Cosmol- methodology (BM), that can beapplied to the blind detec- ogy and Fundamental Physics, and by the TRIL program tion of point sources in CMB maps. The method incorpo- at the Abdus Salam International Centre of Theoretical A Bayesian technique for the detection of point sources in CMB maps 9 100 GHz Healey S. E., Romani R. W., Taylor G. B., Sadler E. M., Ricci R., Murphy T., Ulvestad J. S., Winn J. N., 2007, 1.1 ApJS,171, 61 1 Herranz D., Sanz J. L., 2008, IEEE Journal of Selected Topics in Signal Processing, 5, 727 0.9 n Flux (Jy)00..78 HJRe.rArMaSn,.,z3M3D6a.,,r1St´ı0an5ne7zz-JG.oLn.,zH´aloezbsEon.,MLa.sPe.n,bByarAre.irNo.,R2.0B0.2,,DMieNgo- atio0.6 HerranzD.,VielvaP., 2010, IEEESignal Processing Mag- pect0.5 azine, 27, 67 Ex0.4 Hobson M. P., McLachlan C., 2003, MNRAS,338, 765 0.3 Lanz L. F., Herranz D., Sanz J. L., Gonz´alez-Nuevo J., L´opez-Caniego M., 2010, MNRAS,403, 2120 0.2 LeachS.M.,CardosoJ.-F.,BaccigalupiC.,BarreiroR.B., 0.1 BetouleM.,BobinJ.,BonaldiA.,DelabrouilleJ.,deZotti 0.2 0.4 0.6 0.8 1 1.2 Estimated Flux (Jy) G.,DickinsonC.,and20coauthors,2008, A&A,491, 597 L´opez-Caniego M., Gonz´alez-Nuevo J., Herranz D., Mas- Figure 9. Expectation valueofthefluxagainstestimatedflux. sardi M., Sanz J. L., De Zotti G., Toffolatti L., Argu¨eso The95%confidence intervalsarealsoplotted(100GHz). F., 2007, ApJS, 170, 108 MahonyE., EkersR.,Massardi M., MurphyT., SadlerE., Sadler 2010, in IAU Symposium Vol. 267 of IAUSympo- sium, The Australia Telescope 20 GHz (AT20G) Survey. Physics, through a specific collaboration agreement with pp264–264 ISTI-CNR. ES and EEK acknowledge support from ASI Massardi M., Ekers R. D., Murphy T., Ricci R., Sadler through ASI/INAF Agreement I/072/09/0 for the Planck E. M., Burke S., de Zotti G., Edwards P. G., Han- LFIActivityofPhaseE2.WealsothankJ.Gonz´alez-Nuevo cock P. J., Jackson C. A., Kesteven M. J., Mahony for useful comments and help. We wish also to thank the E., Phillips C. J., Staveley-Smith L., Subrahmanyan R., referee for his comments that have helped us to improve WalkerM. A., Wilson W.E., 2008, MNRAS,384, 775 significantly thequality of this work. Nailong W., 1992, in Worrall D. M., Biemesderfer C., BarnesJ.,eds,AstronomicalDataAnalysisSoftware and Systems I Vol. 25 of Astronomical Society of the Pacific ConferenceSeries,UsingamatchedfiltertoimproveSNR of radio maps.. pp 291–293 REFERENCES NoltaM.R.,DunkleyJ.,HillR.S.,HinshawG.,Komatsu Argu¨eso F., Gonz´alez-Nuevo J., Toffolatti L., 2003, ApJ, E., Larson D., Page L., Spergel D. N., Bennett C. L., 598, 86 Gold B., Jarosik N., Odegard N., Weiland J. L., Wollack CarvalhoP.,RochaG.,HobsonM.P.,2009,MNRAS,393, E.,HalpernM.,KogutA.,LimonM.,MeyerS.S.,Tucker 681 G. S., Wright E. L., 2009, ApJS,180, 296 Clements D. L., Rigby E., Maddox S., Dunne L., Mortier Ricci R., Sadler E. M., Ekers R. D., Staveley-Smith L., A., Pearson C., Amblard A., Auld R., Baes M., Bonfield WilsonW.E.,KestevenM.J.,SubrahmanyanR.,Walker D., and 33 coauthors, 2010, ArXive-prints M. A., Jackson C. A., De Zotti G., 2004, MNRAS, 354, de Zotti G., Massardi M., Negrello M., Wall J., 2010, As- 305 tron. Astrophys.Rev.,18, 1 RiderP. R., 1957, Ann.Inst. Stat. Math., 9, 215 deZotti G., Ricci R.,Mesa D.,SilvaL., Mazzotta P., Tof- Sanz J. L., Herranz D., L´opez-Caniego M., Argu¨eso F., folatti L., Gonz´alez-Nuevo J., 2005, A&A,431, 893 2006, in Proceedings of the 14th European Signal Pro- DevlinM.J.,AdeP.A.R.,AretxagaI.,BockJ.J.,Chapin cessing Conference (2006). EUSIPCO 2006 Conference, E. L., Griffin M., Gundersen J. O., Halpern M. Hargrave Waveletsonthesphere.Applicationtothedetectionprob- P.C.HughesD.H.,and19coauthors,2009, Nature,458, lem. pp 1–5 737 Tauber J. A., 2005, in Lasenby A. N., Wilkinson A., eds, Eales S., Dunne L., Clements D., Cooray A., de Zotti G., NewCosmologicalDataandtheValuesoftheFundamen- DyeS.,IvisonR.,JarvisM.,LagacheG.,MaddoxS.,and tal Parameters Vol. 201 of IAU Symposium, The Planck 89 coauthors. 2010, Publications of the Astronomical So- Mission. pp 86–+ ciety of thePacific, 122, 499 TaylorA.C.,GraingeK.,JonesM.E.,PooleyG.G.,Saun- Feroz F., Hobson M. P., 2008, MNRAS,384, 449 dersR. D.E., Waldram E. M., 2001, MNRAS,327, L1 Gonz´alez-Nuevo J., Argu¨eso F., L´opez-Caniego M., Toffo- Toffolatti L., Argu¨eso Gomez F., de Zotti G., Mazzei P., lattiL.,SanzJ.L.,VielvaP.,HerranzD.,2006,MNRAS, Franceschini A., Danese L., Burigana C., 1998, MNRAS, 369, 1603 297, 117 Gonz´alez-Nuevo J., Massardi M., Argu¨eso F., Herranz D., Waldram E. M., Pooley G. G., Grainge K. J. B., Jones Toffolatti L., Sanz J. L., L´opez-Caniego M., de Zotti G., M.E.,SaundersR.D.E.,ScottP.F.,TaylorA.C.,2003, 2008, MNRAS,384, 711 MNRAS,342, 915 Gonz´alez-Nuevo J., Toffolatti L., Argu¨eso F., 2005, ApJ, 621, 1

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.