Astronomy & Astrophysics manuscript no. milcann
© ESO 2017
February 2, 2017
arXiv:1702.00075v1 [astro-ph.CO] 31 Jan 2017

MILCANN: A neural network assessed tSZ map for galaxy cluster detection

G. Hurier1, N. Aghanim2 & M. Douspis2

1 Centro de Estudios de Física del Cosmos de Aragón (CEFCA), Plaza de San Juan, 1, planta 2, E-44001, Teruel, Spain
2 Institut d'Astrophysique Spatiale, CNRS (UMR8617) Université Paris-Sud 11, Bâtiment 121, Orsay, France
e-mail: [email protected]

Received / Accepted

Abstract

We present the first combination of a thermal Sunyaev-Zel'dovich (tSZ) map with a multi-frequency quality assessment of the sky pixels based on Artificial Neural Networks (ANN), aimed at detecting tSZ sources from sub-millimeter observations of the sky by Planck. We construct an adapted ANN assessment over the full sky and we present the construction of the resulting filtered and cleaned tSZ map, MILCANN. We show that this combination significantly reduces the noise fluctuations and foreground residuals compared to standard tSZ maps. From the MILCANN map, we construct the HAD tSZ source catalog, which consists of 3969 sources with a purity of 90%. Finally, we compare this catalog with ancillary catalogs and show that the galaxy-cluster candidates in the HAD catalog are essentially low-mass (down to $M_{500} = 10^{14}\,M_\odot$), high-redshift (up to $z \leq 1$) galaxy cluster candidates.

Key words. Cosmology: Observations – Cosmic background radiation – Sunyaev-Zel'dovich effect

1. Introduction

Neglecting relativistic corrections we have

$g(\nu) = \left[ x \coth\left(\frac{x}{2}\right) - 4 \right]$,  (3)

with $x = h\nu/(k_B T_{\rm CMB})$. At $z = 0$, where $T_{\rm CMB}(z=0) = 2.726 \pm 0.001$ K, the tSZ effect is negative below 217 GHz and positive for higher frequencies.

Galaxy clusters are the largest virialized structures in the Universe. They are excellent tracers of the matter distribution, and their abundance can be used to constrain the cosmological model in an independent way. Galaxy clusters are objects composed of dark matter, stars, cold gas and dust in galaxies, and a hot ionized intra-cluster medium (ICM). Consequently, they
can be identified in optical bands as concentrations of galaxies (see e.g. Abell et al. 1989; Gladders & Yee 2005; Koester et al. 2007; Rykoff et al. 2013), and they can be observed in X-rays through the bremsstrahlung emission produced by the ionized intra-cluster medium (ICM) (see e.g. Böhringer et al. 2000; Ebeling et al. 2000, 2001; Böhringer et al. 2001). The same hot ICM also creates a distortion in the black-body spectrum of the cosmic microwave background (CMB) through the thermal Sunyaev-Zel'dovich (tSZ) effect (Sunyaev & Zeldovich 1969, 1972), an inverse-Compton scattering between the CMB photons and the ionized electrons in the ICM.

tSZ surveys provide a different window on the cluster population compared to other large-scale structure tracers, allowing the detection of higher-redshift objects. They also provide a nearly mass-limited census of the cluster population at high redshift, where the abundance is strongly sensitive to cosmological parameters (Carlstrom et al. 2002; Planck Collaboration results. XX 2014). The tSZ integrated Comptonisation parameter, $Y_{\rm SZ}$, is related to the integrated electron pressure and hence to the total thermal energy of the cluster gas. It correlates with mass with a low intrinsic scatter and a small dependence on the dynamical properties of the cluster (e.g. da Silva et al. 2004; Kay et al. 2012; Hoekstra et al. 2012; Planck Collaboration et al. 2013; Sifón et al. 2013).

Recent large catalogues based on measurements of the tSZ effect have been produced from Planck (Planck Collaboration early results. VIII 2011; Planck Collaboration results XXXII 2015), ACT (Marriage et al. 2011), and SPT (Bleem et al. 2014) data.

Several tSZ source detection algorithms (see e.g., Melin et al.

Through the tSZ effect, CMB photons receive an average energy boost by collision with hot (a few keV) ionized electrons of the intra-cluster medium (see e.g. Birkinshaw 1999; Carlstrom et al. 2002, for reviews). The thermal SZ Compton parameter in a given direction, $\mathbf{n}$, on the sky is given by

$y(\mathbf{n}) = \int n_e \frac{k_B T_e}{m_e c^2} \sigma_T \, ds$,  (1)
where $ds$ is the distance along the line-of-sight, $\mathbf{n}$, and $n_e$ and $T_e$ are the electron number density and temperature, respectively. In units of CMB temperature, the contribution of the tSZ effect for a given observation frequency $\nu$ is

$\frac{\Delta T_{\rm CMB}}{T_{\rm CMB}} = g(\nu)\, y$.  (2)

2006; Carvalho et al. 2009) have been proposed and compared (Melin et al. 2012), demonstrating that a multi-frequency approach for the tSZ detection is more robust than a tSZ map-based approach. In particular, tSZ map-based detection methods present a significantly lower level of purity than multi-frequency-based approaches.

G. Hurier, N. Aghanim & M. Douspis: MILCANN: A neural network assessed tSZ map for galaxy cluster detection

The expected number of galaxy clusters is extremely sensitive to cosmological parameters (The Dark Energy Survey Collaboration 2005; Vanderlinde et al. 2010; Sehgal et al. 2011; Böhringer et al. 2014; Planck Collaboration results. XX 2014). An important factor that limits the accuracy in the determination of cosmological parameters is the contamination of the tSZ signal by other astrophysical emissions, mainly radio and infra-red point sources and the cosmic infra-red background (Dunkley et al. 2011; Shirokoff et al. 2011; Reichardt et al. 2012; Sievers et al. 2013; Planck Collaboration results. XXI 2014), which can produce spurious galaxy cluster detections and/or a significant bias in the measured tSZ fluxes. A tailored quality assessment of the tSZ signal from galaxy clusters is thus required to produce high-purity, robust galaxy cluster samples detected with the tSZ effect.

• Ancillary data used to characterize galaxy cluster candidates:
– the reprocessed IRAS maps, IRIS (Improved Reprocessing of the IRAS Survey, Miville-Deschênes & Lagache 2005),
– the AllWISE Source Catalog (footnote 1)

3. Artificial neural network filtering

In this section, we describe how we build an artificial neural network (ANN) filtered tSZ map. ANN-based quality assessment makes it possible to learn directly from the data the characteristic signature of real tSZ sources and of spurious signal, using a reference sample of astrophysical sources. This approach has
been shown to be an efficient method to identify spurious tSZ detections (Aghanim et al. 2015; van der Burg et al. 2016) from Planck tSZ candidate catalogs (Planck Collaboration results XXXII 2015; Planck Collaboration 2015 results XXVII 2016). In the following, we extend this approach to each pixel of the sky maps rather than to a pre-selected sample of tSZ candidates.

3.1. Neural Network Training

We summarize here the key elements of the ANN training process and refer to Aghanim et al. (2015) for a more detailed description.

3.1.1. SED fitting

An artificial neural network (ANN) based quality assessment method has been proposed (Aghanim et al. 2015) and applied to Planck galaxy cluster catalogs (Planck Collaboration 2015 results XXVII 2016). This method uses the Planck multi-frequency data to assess the quality of the tSZ signal by decomposing the measured signal into the different astrophysical components that contribute at the Planck frequencies. It showed that the first Planck tSZ-source catalog (Planck Collaboration results XXXII 2015) suffers from contamination by galactic CO sources and infra-red emission (Aghanim et al. 2015). Recent follow-up (van der Burg et al. 2016) of Planck tSZ sources in the optical has shown the efficiency of this ANN-based quality assessment by confirming the spurious nature of tSZ sources with bad quality criteria.

In this work, we propose a new approach for galaxy cluster detection that uses an ANN-based filtering of tSZ full-sky maps. First, in section 2, we present the different data sets used in this work. In section 3 we detail the construction of the ANN-based filter, which makes tSZ source detection in tSZ-map space competitive with multi-frequency approaches. Then, in section 4, we construct a sample of galaxy cluster candidates and we present a detailed characterization of this new galaxy cluster detection method. Finally, in sect. 5 we perform a multi-wavelength assessment of the detected galaxy cluster candidates.

2.
Data

For the present analysis we used several data sets:
• Planck intensity maps from 70 to 857 GHz (Planck Collaboration 2015 results I 2016), assuming the spectral response from Planck Collaboration results IX (2014) and gaussian beams (Planck Collaboration results. VII 2014),
• Planck full-sky lensing map (Planck Collaboration et al. 2015),
• Catalogs used to describe the properties of astrophysical sources:
– a meta catalog of X-ray detected galaxy clusters (MCXC, Piffaretti et al. 2011, and references therein),
– the WHL12 catalogue (132,684 objects, Wen et al. 2012) of galaxy clusters detected in the SDSS data (York et al. 2000),
– a catalog of tSZ sources detected in Planck data (PSZ1 hereafter, see Planck Collaboration results XXXII 2015).

Figure 1. Matched filter in $\ell$ space used for the ANN-based quality assessment.

Following Aghanim et al. (2015), we focus on the astrophysical emissions that most affect the tSZ detection in multi-frequency experiments. This issue was addressed in both Planck Collaboration et al. (2011) and Planck Collaboration results XXXII (2015). We model the spectral energy distribution (SED) taking into account five components: the thermal SZ (tSZ) effect, neglecting relativistic corrections; the CMB signal; and the CO emission. We also add an effective IR component representing the contamination by dust emission, cold Galactic sources, and CIB fluctuations; and an effective radio component accounting for diffuse radio and synchrotron emission and radio sources. The flux in each channel, i.e. frequency, is then written as
$F_\nu = A_{\rm SZ} F_{\rm SZ}(\nu) + A_{\rm CMB} F_{\rm CMB}(\nu) + A_{\rm IR} F_{\rm IR}(\nu) + A_{\rm RAD} F_{\rm RAD}(\nu) + A_{\rm CO} F_{\rm CO}(\nu) + N(\nu)$,  (4)

where $F_{\rm SZ}(\nu)$, $F_{\rm CMB}(\nu)$, $F_{\rm IR}(\nu)$, $F_{\rm RAD}(\nu)$, and $F_{\rm CO}(\nu)$ are the spectra of the tSZ, CMB, IR, radio, and CO emissions; $A_{\rm SZ}$, $A_{\rm CMB}$, $A_{\rm IR}$, $A_{\rm RAD}$, and $A_{\rm CO}$ are the corresponding amplitudes; and $N(\nu)$ is the instrumental noise.

For $F_{\rm IR}(\nu)$, we consider a modified black-body spectrum with temperature $T_d = 17$ K and index $\beta_d = 1.6$. This assumption is representative of the dust properties at high galactic latitudes. The contribution from CIB fluctuations affects the flux measurement, but is not a major contamination from the point of view of the detection, i.e. spurious sources. For $F_{\rm RAD}(\nu)$, we consider a power-law emission, $\nu^{\alpha_r}$, with index $\alpha_r = -0.7$ in intensity units, representative of the average property of the radio emission.

– catalogues of sources detected at 30 GHz and 353 GHz from the Planck Catalogue of Compact Sources (PCCS) (Planck Collaboration results XXVIII 2014)

Footnote 1: http://wise2.ipac.caltech.edu/docs/release/allwise/expsup/

Thus we derive an SED, $F_\nu$, for each pixel, which is fitted assuming the model in Eq. 4, where we fit for $A_{\rm SZ}$, $A_{\rm CMB}$, $A_{\rm IR}$, $A_{\rm RAD}$, and $A_{\rm CO}$ through a linear fit of the form

$A = (F^T C_N^{-1} F)^{-1} F^T C_N^{-1} F_\nu$,  (6)

with the mixing matrix $F^T$, the instrumental noise covariance matrix $C_N$, and $A$ a vector containing the fitted parameters. In this approach, $C_N$ only accounts for the instrumental noise, and we implicitly assume that the five components considered in the model reproduce the astrophysical signal in the data.

The efficiency of the dimensionality reduction (footnote 2) is illustrated in Fig. 2 (right panel), where we show the correlation matrix of the fitted SED parameters, compared to the correlation matrix

In order to compute the SED of the considered pixels, we use the Planck frequency maps from 70 to 857 GHz. First,
First, ofmeasuredfluxesfrom70to857GHz(leftpanel).Weobserve eachmapissettoaresolutionof13arcmin,i.e.thelowestreso- that we have a high degree of correlation between frequencies, lution(70GHzmap). especially at low frequency due to the CMB component (< 217 GHz), and at higher frequency (> 217 GHz) due to the thermalcomponent.Incontrast,intheSEDparameterspace,we observe that the correlation matrix is almost diagonal, except foraspatialcorrelationbetweenthermaldustandCOemission. Thiscorrelationistheconsequenceofthethermaldustemission from molecular clouds, and is thus physically motivated. We alsonoteasignificantcorrelationbetweenthetSZeffectandthe radio emission. This correlation is produced by the similarities of tSZ and radio SEDs in Planck low-frequency channels that dominates the tSZ flux estimation. Indeed, the tSZ effect is negativeatlow-frequencyandconsequentlyintermsofbest-fit ahightSZfluxcanbecompensatedbyastrongradioemission, Figure2. Left panel: correlation matrix of the measured fluxes which induces a high degree of correlation between tSZ and from 30 to 857 GHz estimated on 2000 random positions over radioemissionfluxes. thesky.Rightpanel:correlationmatrixoffittedSEDparameters We also observe that the match-filtering approach for the SED fromthesamepositions. estimation enables to obtain a correlation matrix for SED parameters which is more diagonal, especially for the tSZ amplitude A , than an aperture photometry SED estimate (see SZ By contrast to Aghanim et al. (2015), we compute the flux Fig.2inAghanimetal.2015,forcomparison). foreachpixelandeachfrequencyusingamatch-filterin(cid:96)space presented in Fig. 1. We choose this approach, rather than the aperturephotometrytoimprovethephotometryoftSZsources. To build this filter, we assume the power spectrum, y , of a 3.1.2. 
Full-skyANN-basedqualityassessment (cid:96) tSZ signal from a cluster with R = 5(cid:48) (point-like with re- 500 We consider a standard three-layer back-propagation ANN to spect to the Planck experiment), and a universal pressure pro- separate pixels of the sky maps into three populations of reli- file (Arnaud et al. 2010). We also consider the power spectrum able quality (the Good), unreliable quality/false (the Bad), and of the CMB,CCMB, computed using Planck best-fit cosmology (cid:96) noisysources(theUgly).Theinputsoftheneuralnetworkcon- (Planck Collaboration 2015 results XIII 2015), and the power sist of the five SED parameters and the outputs represent the spectrumofthenoise,CNN,inthe100GHzchannel(estimated (cid:96) threeclassesusedtoclassifyeachskypixel. using half-ring map difference) which is the frequency channel Webrieflypresentherethebasicsoftheartificialneuralnetwork with the highest noise level. We note that to obtain model in- method.Wedefine dependent flux estimations, the match-filter has to be the same at all frequency. Considering that for relevant frequency chan- Q=g(W g(W (W F +b)+b )+b ), (7) o h r ν r h o nelsforthetSZfluxestimation(100to217GHz),angularscales ((cid:96)∈[1000,2000]),andgalacticlatitudesthethermaldustampli- whereg(x) = 1/(1+exp(−x))istheactivationfunction,W = r tudefromtheMilkyWayissmallcomparedtotheCMBcontri- (FTC−1F)−1FTC−1 corresponds to a physically-based dimen- N N bution,wedonotincludeagalacticthermaldustcontributionin sionalreduction,W aretheweightsbetweeninputandhidden h thecomputationofthematchfilter.Thenthefilteriscomputed layers, W are the weights between hidden and output layers, o as b arethebiasesbetweeninputandhiddenlayers,andb arethe h o biasesbetweenhiddenandoutputlayers. (cid:115) F = y(cid:96) , (5) To train the neural network, we use the same sample for (cid:96) CCMB+CNN Good, Bad, and Ugly classes as in Aghanim et al. (2015). 
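The forward pass of Eq. 7 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the trained network of the paper: the hidden-layer size, the random (untrained) weights, and the example amplitudes are hypothetical, and the input is taken to be the five SED amplitudes already obtained from the linear projection $W_r F_\nu + b_r$.

```python
import math
import random


def sigmoid(x):
    """Activation function g(x) = 1 / (1 + exp(-x)) of Eq. 7."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(weights, biases, inputs):
    """Apply g(W v + b) for one fully connected layer."""
    return [
        sigmoid(sum(w * v for w, v in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def classify(sed_amplitudes, w_h, b_h, w_o, b_o):
    """Return the three class outputs (Good, Bad, Ugly) for one pixel."""
    hidden = layer(w_h, b_h, sed_amplitudes)
    return layer(w_o, b_o, hidden)


# Hypothetical, untrained weights: 5 inputs, 8 hidden units, 3 outputs.
rng = random.Random(0)
N_IN, N_HIDDEN, N_OUT = 5, 8, 3
w_h = [[rng.uniform(-1.0, 1.0) for _ in range(N_IN)] for _ in range(N_HIDDEN)]
b_h = [0.0] * N_HIDDEN
w_o = [[rng.uniform(-1.0, 1.0) for _ in range(N_HIDDEN)] for _ in range(N_OUT)]
b_o = [0.0] * N_OUT

# Hypothetical SED amplitudes (A_SZ, A_CMB, A_IR, A_RAD, A_CO) for one pixel.
outputs = classify([1.0, 0.2, -0.1, 0.05, 0.0], w_h, b_h, w_o, b_o)
q_good, q_bad, q_ugly = outputs
```

With trained weights, the three outputs would play the role of the Good, Bad, and Ugly class values; here they only show that each output lies between 0 and 1.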
We adapt the ANN to the match-filter SED estimation. Thus, we use the value in the central pixel toward each source as a proxy of the flux of the considered source. We split the training sample into two sub-samples, the first one used to train the ANN, and the second one used to prevent over-training of the ANN.

We define the error on the classification as

$E = \frac{1}{2} \sum_{\rm class} \left( Q^{\rm (true)}_{\rm class} - Q_{\rm class} \right)^2$,  (8)

where class stands for Good, Bad, or Ugly and $Q^{\rm (true)}_{\rm class} = 1$ or 0 depending on whether the source belongs to the considered class.

and applied on the harmonic space coefficients, $a^*_{\ell,m} = F_\ell\, a_{\ell,m}$, of Planck intensity maps from 70 to 857 GHz. We verified that our results do not significantly depend on the chosen amplitude of $C_\ell^{NN}$ for the match-filter computation.

Footnote 2: The projection of the measured SED on a basis of physically motivated SEDs reduces the number of parameters that describe the measured SED. This method can be compared to principal component analysis, using a model-based basis instead of eigenvectors of the observations.

Figure 3. From top to bottom: Planck tSZ MILCA map, neural network filter, MILCANN map, for the full sky in Mollweide projection. Grey regions are masked due to galactic foregrounds or point-source contamination.

Figure 4. From left to right: Planck tSZ MILCA map, neural network filter, MILCANN map, for a patch of 8.5 × 8.5 degrees in gnomonic projection centered on galactic coordinates (l, b) = (263°, -24°). Grey regions are masked due to point-source contamination.

emission. Radio point sources produce a negative signal in terms of Compton parameter, thus they can be easily separated from a real tSZ structure (Hurier et al. 2013). Significant CO emission is essentially located at low galactic latitudes, and can be masked for galaxy cluster detection purposes. CIB emission is more complex to deal with, as it is homogeneously distributed over the sky. In Sect. 3.2.3, we will propagate CIB-induced uncertainties.

3.2.2. MILCANN map

On Fig. 3, we observe that the tSZ map reconstructed from Planck data suffers from bias due to residuals from other astrophysical emissions. Using the neural-network based quality assessment presented in sect. 3.1.2, we can obtain an estimate of the quality of the tSZ signal along each line of sight. First, we define an ANN-based filter, $Q_N$, as

$Q_N = Q_{\rm GOOD} (1 - Q_{\rm BAD})$,  (9)

where $Q_{\rm GOOD}$ and $Q_{\rm BAD}$ are the ANN classification output values for the Good and Bad classes. By construction this ANN filter ranges from 0 to 1, with values close to 1 for pixels that present a high-quality tSZ signal. We did not apply any weighting over the Ugly ANN output, to avoid a catastrophic reduction of low signal-to-noise ratio galaxy clusters. The noise amplitude reduction is already performed by the weighting over $Q_{\rm GOOD}$.

In Fig. 3 and 4, we show the full-sky ANN filter described in Sect. 3.1.2 and the MILCANN map. We observe a clear spatial correlation between the ANN filter and the tSZ MILCA map. However, we observe that the ANN filter presents values close to 1 even for some pixels where no clear tSZ excess can

3.2. tSZ quality assessed map

3.2.1. MILCA Planck tSZ map

We construct a tSZ map with the MILCA method (Hurier et al. 2013), using Planck HFI data from 100 to 857 GHz. We verified that including frequencies from 30 to 70 GHz does not significantly change the reconstructed map, especially at galaxy cluster scales.

We performed the construction of the tSZ map using 8 filters in spherical harmonic space. For the first three filters we used 2 constraints (tSZ and CMB), and for the last five filters we only used a constraint on the tSZ SED. The map reconstruction is performed with an effective FWHM of 7 arcmin. For all filters, 2 degrees of freedom have been used to minimize the variance of the noise (see Hurier et al. 2013, for a detailed description of the MILCA method).

In Fig.
3, we show the MILCA full-sky map at 7 arcmin FWHM. Figure 4 shows a zoom on a small region of 8.5 × 8.5 degrees where bright galaxy clusters can be observed. In the full-sky map, we observe a significant amount of foreground residuals near the galactic plane, where synchrotron and free-free residuals appear as negative biases in the tSZ y-map signal. We also observe contamination by bright galactic cirrus correlated with the zodiacal light. As demonstrated in previous works (Hurier et al. 2013; Planck Collaboration 2015 results XXII 2015), the main sources of contamination in tSZ maps built from Planck intensity maps are radio point sources, CO, and CIB

be observed. These misclassified pixels are produced by chance alignment between noise structure and the tSZ spectral signature. As a consequence, a tSZ source detection performed directly on the ANN filter map would lead to low-purity samples.

Then, we convolve the MILCA tSZ map, noted $\hat{y}$, by the matched filter used for the SED fitting in section 3.1.1. Thus, the filtered MILCA map, $\hat{y}_f$, has a transfer function consistent with the map used to perform the ANN classification.

Finally, we derive the MILCANN map as

$M_{\rm MILCANN} = \hat{y}_f\, Q_N$.  (10)

Figure 5. From top to bottom: Planck tSZ MILCA noise map, MILCANN noise map, for the full sky in orthographic projection. Grey regions are masked due to point-source contamination.

On Fig. 3, the MILCANN map shows that the ANN filter has significantly removed the foreground contamination, especially near the galactic plane, where synchrotron and free-free contamination have been completely suppressed. Similarly, the contamination produced by high-latitude galactic cirrus has been filtered out by the ANN. On Fig. 4, we also observe that the MILCANN map presents a significantly reduced background, but almost conserves the intensity of bright tSZ pixels. However, we stress that the ANN filtering does not conserve the shape of the tSZ sources, as it modifies the intensity of faint tSZ pixels in the outskirts of galaxy clusters.

As shown by Melin et al. (2012), tSZ-map based galaxy cluster detection methods suffer from a high level of contamination by spurious detections, since it is a priori not possible to disentangle real tSZ emission from biases induced by other astrophysical emission residuals. By contrast, and owing to its significantly reduced residual signals, the MILCANN map seems well suited for galaxy cluster detection. However, it cannot be used to produce accurate estimates of the flux or the shape of tSZ sources.

3.2.3. Noise and CIB-residuals simulations

We have shown qualitatively that the MILCANN tSZ map presents a significantly reduced background compared to the MILCA tSZ map. In this section, we describe our modeling of the noise and CIB residuals in the MILCA and MILCANN tSZ maps, to quantify the improvement obtained by applying an ANN filtering.

Figure 6. Intensity distribution of pixels in the tSZ maps for: the MILCA map (light blue), the MILCANN map (dark blue), simulation of noise + CIB in the MILCA map (orange), simulation of noise + CIB in the MILCANN map (red).

The tSZ maps are derived from component separation methods. They are constructed through a linear combination of Planck frequency maps that depends on the angular scale and the pixel, $p$, as

$\hat{y} = \sum_{i,j,\nu} w_{i,p,\nu}\, T_{i,p}(\nu)$,  (11)

where $T_{i,p}(\nu)$ is the Planck map at frequency $\nu$ for the angular filter $i$, and $w_{i,p,\nu}$ are the weights of the linear combination. Then, the CIB contamination in the y-map reads

$y_{\rm CIB} = \sum_{i,j,\nu} w_{i,p,\nu}\, T^{\rm CIB}_{i,p}(\nu)$,  (12)

where $T^{\rm CIB}_{i,p}(\nu)$ is the CIB emission at frequency $\nu$. Using the weights $w_{i,p,\nu}$, and considering the CIB luminosity function, it is possible to predict the expected CIB leakage as a function of the redshift of the source by propagating the SED through the weights that are used to build the tSZ map. As shown by Planck
Collaboration 2015 results XXIII (2015), the CIB at low z leaks with a small amplitude into the tSZ map, whereas high-z CIB produces a higher, dominant, level of leakage. Indeed, internal-linear-combination-based component separation methods focus on Galactic thermal dust removal, and thus are less efficient at subtracting high-z CIB sources that present a different SED.

not present a symmetric distribution. So, we are dealing with a non-Gaussian, inhomogeneous, correlated noise with an asymmetric distribution. As observed on Fig. 6, the noise is more likely to produce positive values in the MILCANN map than negative values. It implies that the noise has a non-zero expectation value. Consequently, in the following we use and propagate the complete noise distribution.

3.2.4. Noise inhomogeneities

The CIB power spectra have been constrained by previous Planck analyses (see e.g., Planck Collaboration results XXX 2014). They can be used to predict the expected CIB leakage, $y_{\rm CIB}$, in the tSZ map, $y$.

We performed 200 Monte-Carlo simulations of CIB that follow the CIB power spectra. Then, we add instrumental noise to the simulated CIB maps. Finally, we apply the weights used to build the tSZ map to these simulations. We obtained 200 realizations of instrumental noise and CIB at the Planck frequency maps level, consistent with noise and CIB realizations at the MILCA tSZ map level.

It is important to stress that the noise in the MILCA tSZ map is by construction correlated with the noise in the frequency maps. Thus, the noise on the ANN classes is also correlated with the noise in the MILCA tSZ map. Consequently, if we want to produce a fair description of the noise, we have to train another neural network on the simulated maps to reproduce the correlation between the noise in the MILCA map and the noise on the ANN classification. Considering that the training of an ANN is a non-linear process, we also added CMB, point sources, and thermal dust to the noise + CIB simulations during the training process.
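The Monte-Carlo propagation of frequency-map noise through the linear-combination weights can be illustrated with a toy example. This is a sketch under strong simplifying assumptions: uncorrelated white Gaussian noise stands in for the simulated CIB plus instrumental noise, a single set of weights is used for all pixels and scales, and the weights and noise levels are hypothetical values.

```python
import math
import random


def simulate_y_noise(weights, noise_levels, n_pix, rng):
    """One realization: linear combination of per-frequency noise maps."""
    return [
        sum(w * rng.gauss(0.0, s) for w, s in zip(weights, noise_levels))
        for _ in range(n_pix)
    ]


# Hypothetical weights and per-frequency noise levels (arbitrary units).
weights = [0.5, -0.3, 0.1]
noise_levels = [1.0, 2.0, 4.0]

rng = random.Random(1)
realizations = [simulate_y_noise(weights, noise_levels, 500, rng)
                for _ in range(200)]

# Compare the measured scatter with the analytic expectation for white noise.
values = [v for realization in realizations for v in realization]
mean = sum(values) / len(values)
measured_std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
expected_std = math.sqrt(sum((w * s) ** 2 for w, s in zip(weights, noise_levels)))
```

For white noise the measured scatter should agree with the quadratic sum of the weighted noise levels; correlated components such as the CIB would require realizations drawn from their power spectra instead.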
Finally, we build and apply the noise-based ANN to the MILCA noise + CIB-residuals simulation. On Fig. 5, we present a noise + CIB-residuals simulation before and after applying the noise-based ANN filter. We observe that the ANN filtering significantly reduces the noise level in the MILCA simulated map. From the MILCANN simulated map, we derive the standard deviation of the noise in the MILCANN map by computing the local standard deviation of the MILCANN simulated noise map inside a two-degree gaussian beam.

In Fig. 6, we compare the intensity distributions in the MILCA and MILCANN maps. For the MILCA map, we observe a significant tail of negative intensity (mainly produced by radio-source contamination). We do not observe this contamination in the MILCANN map, as the ANN filter significantly reduces radio-source contamination. We also observe that the noise in the MILCANN map is lower than in the MILCA map by a factor of five. However, we note that the intensity of the brightest pixels in the map is not affected by the ANN filter.

We note that the noise in MILCANN is not Gaussian. Considering the correlation between the ANN filter and the noise in the MILCA map, the noise in the MILCANN map does

Figure 7. Distribution of the noise standard deviation, $\sigma_y$, across the MILCANN full-sky map.

Due to the Planck scanning strategy, the noise level on the sky is inhomogeneous. Ecliptic poles present a better redundancy of observations and thus a significantly lower level of noise (Planck Collaboration 2015 results I 2016).

By construction, the noise of the ANN filter is related to the noise in the MILCA map. The MILCANN map is obtained as the product of the filtered MILCA map and the ANN filter, thus the noise inhomogeneities are amplified. The distribution of the pixel-dependent noise level is presented on Fig. 7, which shows the distribution of the noise standard deviation, $\sigma_y$. This quantity is estimated locally in a 4-degree gaussian beam. The low-noise regions correspond to the ecliptic poles and the high-noise regions correspond to the ecliptic plane.
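A local, beam-weighted noise estimate of this kind can be sketched as below. This is an illustrative one-dimensional toy, not the spherical computation used for the maps: the Gaussian weighting in pixel units, the array `sky`, and the chosen FWHM are all assumptions made for the example.

```python
import math
import random


def local_std(values, index, fwhm_pix):
    """Gaussian-weighted standard deviation of `values` around `index`."""
    sigma = fwhm_pix / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    weights = [math.exp(-0.5 * ((i - index) / sigma) ** 2)
               for i in range(len(values))]
    norm = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / norm
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / norm
    return math.sqrt(var)


# Toy 1-D "sky": low noise in the first half, three times higher in the second.
rng = random.Random(2)
sky = ([rng.gauss(0.0, 1.0) for _ in range(2000)]
       + [rng.gauss(0.0, 3.0) for _ in range(2000)])

sigma_low = local_std(sky, 1000, fwhm_pix=400)
sigma_high = local_std(sky, 3000, fwhm_pix=400)
```

Evaluating `local_std` at every pixel would give a noise map analogous to $\sigma_y$, with lower values where the simulated noise is lower.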
To obtain a map of the signal-to-noise ratio, we construct the noise-normalized quantity, $\hat{y}_\sigma$, as

$\hat{y}_\sigma = \frac{\hat{y}_f}{\sigma_y}$.  (13)

We note that using a homogeneous threshold on $\hat{y}_\sigma$ is equivalent to using a pixel-dependent threshold on $\hat{y}_f$.

Figure 8. Distribution of the MILCANN map (top panel) and MILCANN noise simulation (bottom panel) as a function of $Q_N$ and normalized intensity, $y_\sigma$. The colors show the number of pixels in logarithmic scale.

3.2.5. ANN response

In this section, we quantify the transfer function of the ANN-filtering procedure between the tSZ intensity in the MILCA map and the tSZ intensity in the MILCANN map.

Figure 9. ANN-filter average value as a function of the match-filtered intensity.

By applying the artificial neural network filter, we reduce the noise level. To a lesser extent, we also reduce the intensity of the tSZ effect from galaxy clusters. Indeed, the value of $Q_N$ is not exactly 1 for all galaxy clusters. Additionally, the value of $Q_N$ is sensitive to the flux of the cluster: a high-flux galaxy cluster will have $Q_N \simeq 1$, whereas low-flux galaxy clusters will be classified as noise and will have $Q_N \ll 1$.

To quantify this, we select 10000 random pixels within the 84% mask used for the detection and we add a constant amount of an emission following a tSZ SED. This approach properly accounts for the real sky background and noise level. Then, we apply the ANN to these 10000 modified pixels, and derive the average value of $Q_N$ as a function of the input tSZ intensity, which gives the transfer function of the ANN filtering procedure.

On Fig. 9, we show the average value of the ANN filter as a function of the filtered tSZ signal intensity, $y_f$. We observe that the ANN response presents a steep transition: all signal below $y_f = 10^{-8}$ is completely suppressed by the ANN, whereas signal above $y_f = 2 \times 10^{-7}$ is almost unaffected by the ANN filtering. We stress that these intensities are obtained after filtering and are consequently not comparable with the total tSZ intensity in the MILCA map.
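The injection of an emission following a tSZ SED can be sketched with the non-relativistic spectral function of Eq. 3. This is a minimal illustration: the pixel values, the list of frequencies, and the assumption that fluxes are expressed in units of $\Delta T_{\rm CMB}/T_{\rm CMB}$ (Eq. 2) are choices made for the example, not details taken from the analysis.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J s
K_BOLTZMANN = 1.380649e-23  # Boltzmann constant, J / K
T_CMB = 2.726               # CMB temperature in K, as quoted with Eq. 3


def tsz_spectrum(freq_hz):
    """Non-relativistic tSZ spectral function g(nu) = x coth(x/2) - 4."""
    x = H_PLANCK * freq_hz / (K_BOLTZMANN * T_CMB)
    return x / math.tanh(x / 2.0) - 4.0


def inject_tsz(pixel_sed, y_injected):
    """Add a tSZ signal of Compton parameter y to a {freq_hz: dT/T} mapping."""
    return {nu: value + tsz_spectrum(nu) * y_injected
            for nu, value in pixel_sed.items()}


# Hypothetical background values (in dT_CMB / T_CMB) for one pixel.
background = {100e9: 1.0e-6, 143e9: -2.0e-6, 217e9: 0.5e-6, 353e9: 3.0e-6}
modified = inject_tsz(background, y_injected=1.0e-5)
```

Repeating the injection for a range of `y_injected` values and averaging the filter value over many pixels would give a response curve of the kind shown in Fig. 9.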
We verified that we derive a consistent transfer function for $Q_N$ when it is computed from the noise simulations of Sect. 3.2.3.

Figure 8 presents the distributions of the MILCANN map and of the MILCANN noise simulation as a function of $y_\sigma$ and $Q_N$. We observe that the MILCANN noise simulation does not show pixels with both high $Q_N$ and high $y_\sigma$. We also observe that for $Q_N \simeq 0$ the MILCANN map presents a significantly broader distribution of $y_\sigma$ than the noise simulation. This is produced by foreground residuals that are present in the MILCA tSZ map. We verified that these residuals are strongly correlated with galactic latitude, which ensures that they are related to systematic effects. However, we do not observe a similar behavior at larger values of $Q_N$. This confirms that foreground residuals are strongly reduced by the ANN-filtering procedure, as already observed on Fig. 6.

of extended galaxy clusters. However, this is not an important limitation since our main goal is to detect compact tSZ sources associated with new galaxy clusters that are either low-mass or high-z. Considering the resolution of Planck tSZ maps (roughly 7 arcmin), such galaxy clusters are point-like.

4. tSZ candidate detection

In this section, we use the improved tSZ map obtained after applying the ANN-based filter to perform galaxy cluster detection. We also characterize the purity and completeness of the derived galaxy cluster candidate sample, to assess the improvement compared to previous galaxy cluster catalogs derived from the Planck data.

4.1. Methodology

To detect sources in the MILCANN full-sky map, we apply a mask of the galactic plane and of point sources detected by Planck, keeping 84% of the sky, as defined in Planck Collaboration results XXXII (2015). Then, using the noise standard deviation map computed in section 3.2.3, we homogenize the MILCANN map noise level to perform the detection in signal-to-noise units on the full sky with a homogeneous threshold. We applied a threshold,
$y_\sigma > 3$, to the MILCANN map, and consider as detected sources all sets of adjacent pixels that are above the threshold. We finally clean multiple detections of the same object by merging detected sources within a radius of 10 arcmin (consistent with the Planck angular resolution) and remove all sources detected on fewer than 5 pixels of 1.7 × 1.7 arcmin$^2$ (to avoid detections produced by anomalous pixels). We derive a sample of 3969 galaxy cluster candidates. In the following, we refer to this sample of galaxy cluster candidates as the HAD catalog.

4.2. Characterization of the MILCANN detection method

In this section we characterize the detection method. We present in detail each step that allows us to compute the selection function of the galaxy cluster detection. Thus, we present a detailed description of the galaxy cluster signal transfer function across our processing.

4.2.1. Fourier space filtering response

Figure 10. Central intensity of the SZ signal for a galaxy cluster with $Y_{500} = 1$ arcmin$^2$ as a function of the galaxy cluster typical radius $T_{500}$.

Figure 11. Completeness of the HAD catalog as a function of $M_{500}$ and $z$. The color scale is logarithmic and ranges from $10^{-2.5}$ (dark blue) to 1 (red).

Furthermore, for these clusters or for more extended ones, we can compute their tSZ signal directly from the MILCA map or from the frequency maps.

4.2.2. Completeness

Applying all the steps we detailed in this section, we can write the complete processing we apply to the astrophysical signal as

$y_\sigma = Y_{500}\, \frac{y_f(T_{500})\, Q_N(y_f)}{\sigma_y} + N_\sigma$,  (14)

where $Y_{500}$ is the tSZ flux of the galaxy cluster, $y_f$ is the galaxy cluster match-filtered intensity presented on Fig. 10, the dependency of $Q_N$ on $y_f$ is presented on Fig. 9, the distribution of the noise across the map, $\sigma_y$, is shown on Fig. 7, and $N_\sigma$ is the homogenized noise in the MILCANN map.

Consequently, for a given total mass, $M_{500}$, and a given redshift,
$z$, the galaxy cluster signal distribution for $y_\sigma$ maps is obtained by the convolution of the $M_{500}-Y_{500}$ relation scatter (assuming the scaling relation from Planck Collaboration results. XX 2014, and $(1-b) = 0.8$), the distribution of $\sigma_y$, and the distribution of the noise $N_\sigma$. We assumed that the relation $M_{500}-T_{500}$ does not present any scatter.

The completeness, $C(y_\sigma)$, is then obtained as the ratio of the integral of the $y_\sigma$ distribution, $P(y_\sigma)$, above the detection threshold, normalized by the integral of the full distribution,

$C(y_\sigma) = \frac{1}{\int_0^\infty P(y_\sigma)\, dy_\sigma} \int_t^\infty P(y_\sigma)\, dy_\sigma$,  (15)

where $t$ is the detection threshold applied on the $y_\sigma$ map.

Before applying the ANN filter to build the MILCANN map, we have filtered the MILCA map using the match filter presented on Fig. 1. In this section, we present the estimation of the transfer function of this filtering process. First, we build a mock map of a sky-projected tSZ signal from a galaxy cluster with $Y_{500} = 1$ arcmin$^2$, assuming a GNFW pressure profile (Arnaud et al. 2010) with 1000 pixels per $R_{500}$. Then, we convolve the tSZ mock map by the instrumental beam and the match filter presented in Sect. 3.1.1. We perform this process for values of $R_{500}$ ranging from 0.1 to 100 arcmin. Finally, we extract the tSZ intensity at the center of the galaxy cluster on the convolved mock map.

On Fig. 10, we present the tSZ intensity, $y_f$, to flux ratio after applying the match filter for a galaxy cluster with a universal pressure profile. The match filter we use selects compact objects and thus presents a response that will significantly reduce the flux

Figure 11 shows the completeness as a function of the mass, $M_{500}$, and redshift, $z$, of a given cluster. We observe that we can detect, with a very basic detection method applied on an optimally filtered and cleaned tSZ map, galaxy clusters down to a typical mass of $M_{500} = 1 \times 10^{14}\,M_\odot$. We also observe that for very large masses ($> 2 \times 10^{15}\,M_\odot$) the completeness is slightly smaller than 1.
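The ratio of Eq. 15 has a simple discrete analogue when the $y_\sigma$ distribution is represented by samples. The sketch below is a simplified illustration only: it ignores the scatter of the scaling relation and the distribution of $\sigma_y$, and it draws the noise from a hypothetical unit-variance Gaussian, whereas the noise in the MILCANN map is described in the text as non-Gaussian.

```python
import random


def completeness(expected_y_sigma, noise_samples, threshold):
    """Fraction of realizations of y_sigma = signal + noise above threshold."""
    detected = sum(1 for n in noise_samples if expected_y_sigma + n > threshold)
    return detected / len(noise_samples)


# Hypothetical unit-variance Gaussian noise; the analysis in the text uses
# the simulated, non-Gaussian noise distribution instead.
rng = random.Random(3)
noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]

c_faint = completeness(1.0, noise, threshold=3.0)
c_limit = completeness(3.0, noise, threshold=3.0)
c_bright = completeness(6.0, noise, threshold=3.0)
```

In this toy setting a source whose expected signal equals the threshold is detected about half of the time, and the completeness rises toward 1 as the expected signal increases.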
This loss of completeness at very high mass is produced by the match filter, which significantly reduces the tSZ signal produced by extended (massive) sources.

4.2.3. Purity

Then, we estimate the purity of the catalog by performing the detection of tSZ sources on the MILCANN map and on the MILCANN noise simulation from Sect. 3.2.3. This estimate does not account for all foreground residuals in the MILCANN map. Thus, it may slightly overestimate the purity of the HAD catalog. We found $N_{\rm det}^{(1)}$ detections for the MILCANN map and $N_{\rm det}^{(2)}$ for the simulated noise map. We performed the detection for several detection thresholds. The purity is obtained as $P = \left(N_{\rm det}^{(1)} - N_{\rm det}^{(2)}\right) / N_{\rm det}^{(1)}$. In Fig. 12 we present the purity as a function of the detection threshold. For the threshold used in the construction of the HAD catalog we derived an estimated purity above 90%.

Figure 12. Purity of the HAD catalog as a function of the detection threshold.

4.3. Comparison with reference galaxy cluster catalogs

In this section, we present a brief comparison of the HAD galaxy-cluster candidate catalog and other reference catalogs. We used two approaches for the comparison.

First, we compare the number of galaxy cluster candidates in the HAD catalog with the predicted number of detections, assuming the Planck-SZ cosmology (Planck Collaboration results. XX 2014) and considering the completeness and purity of the HAD catalog. This predicted number is found to be $4082 \pm 700$. The 3969 detected candidates in the HAD catalog are thus fully consistent with the 4082 detections predicted from galaxy-cluster-based cosmological parameters.

Second, we perform a cross-match with reference galaxy cluster catalogs. We compare our detected catalog of candidates with the PSZ2 catalog. The distribution of nearest-neighbor distances is shown in Fig. 13. From our 3969 detections, we find 1243 in common with the PSZ2 (and there are 418 objects in the PSZ2 that have no counterpart in the HAD catalog), among which 997 are confirmed galaxy clusters and have $Q_{\rm Neural} > 0.6$ in the PSZ2 catalog (this quantity is derived from a previous quality assessment using the ANN presented in Aghanim et al. 2015). We also detect 496 known clusters that are missing from the PSZ2 (considering the ACT, SPT, Redmapper, Wen+12, and MCXC catalogs). Figure 13 clearly shows the two populations of sources, those in common with the PSZ2 and those not in common. For common sources we obtain a typical position mismatch below 4 arcmin compared to the position given in the PSZ2. For very extended sources the position mismatch can reach up to 10 arcmin. This position mismatch is consistent with the resolution of the Planck experiment. The case of extended objects strongly depends on the exact methodology used to define the galaxy cluster position. Nevertheless, considering the number densities of the Planck SZ catalog and the HAD catalog, the number of chance associations within a radius of 10 arcmin is 15 ($\simeq 1\%$ of the PSZ2 sample), and within a radius of 4 arcmin this number is 4. This implies that a 10 arcmin matching distance provides a robust association between objects.

Figure 13. Nearest-neighbor distance distribution between HAD sources and PSZ2 public catalog sources.

Figure 14. Nearest-neighbor distance distribution between HAD sources and MCXC sources.

Figure 14 presents the nearest-neighbor distance distribution between HAD and MCXC objects, which presents similar properties