ebook img

Convolution and deconvolution based estimates of galaxy scaling relations from photometric redshift surveys PDF

0.21 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Convolution and deconvolution based estimates of galaxy scaling relations from photometric redshift surveys

Mon.Not.R.Astron.Soc.000,1–7(2009) Printed6January2010 (MNLATEXstylefilev2.2) Convolution and deconvolution based estimates of galaxy scaling relations from photometric redshift surveys 0 Ravi K. Sheth1⋆ & Graziano Rossi2† 1 0 1 Centerfor Particle Cosmology, Universityof Pennsylvania, 209 South 33rd Street, Philadelphia, PA 19104, USA 2 2 Korea Institute for Advanced Study, Hoegiro 87, Dongdaemun-Gu, Seoul 130−722, Korea n a J Accepted 2009December 23.Received2009December21;inoriginalform2009October7 6 ] O ABSTRACT In addition to the maximum likelihood approach,there are two other methods which C are commonly used to reconstruct the true redshift distribution from photometric h. redshift datasets: one uses a deconvolution method, and the other a convolution. We p showhowthesetwotechniquesarerelated,andhowthisrelationshipcanbe extended - to include the study of galaxy scaling relations in photometric datasets. We then o show what additional information photometric redshift algorithms must output so r t that they too can be used to study galaxy scaling relations, rather than just redshift s distributions. We also argue that the convolutionbased approachmay permit a more a [ efficient selection of the objects for which calibration spectra are required. 2 Key words: methods: analytical, statistical – galaxies: formation — cosmology: v observations. 4 1 2 1 . 1 INTRODUCTION Therefore, in Section 2.1 we first consider how ζ compares 0 with the true redshift z, and contrast the convolution and 1 The next generation of sky surveys will provide reasonably deconvolutionmethodsforestimatingdN/dz–whileinSec- 9 accurate photometric redshift estimates, so there is consid- tion2.2wedescribehowtoreconstructtheredshiftdistribu- 0 erable interest in the development of techniques which can tiondirectlyfromcolors.Section2.3showswhatthisimplies v: use these noisy distance estimates to provideunbiased esti- if one wishes to use the full distribution L(z|c). Section 2.4 i mates of galaxy scaling relations. While there exist a num- shows how to extend the logic to the luminosity function, X ber of methods for estimating photometric redshifts (Bu- and Section 2.5 to scaling relations, again by contrasting r davari 2009 and references therein), there are fewer for us- a the convolution and deconvolution methods, and showing ing these to estimate accurate redshift distributions (Pad- what generalization of L(z|c) is required from the photo- manabhanetal. 2005; Sheth2007; Limaetal.2008; Cunha metricredshiftcodesifonewishestodothis.Afinalsection et al. 2009), the luminosity function (Sheth 2007), or the summarizes ourresults. joint luminosity-size, color-magnitude, etc. relations (Rossi Wherenecessary,wewritetheHubbleconstantasH = & Sheth 2008; Christlein et al. 2009; Rossi et al. 2010). 0 100h km s−1 Mpc−1,and weassumeaspatially flatcosmo- Ideally, the output from a photometric redshift esti- mator is a normalized likelihood function which gives the logical model with (ΩM,ΩΛ,h) = (0.3,0.7,0.7), where ΩM and Ω are the present-day densities of matter and cosmo- probability that the true redshift is z given the observed Λ logical constant scaled to thecritical density. colors (i.e. Bolzonella et al. 2000; Collister & Lahav 2004; Cunhaet al. 2009). Let L(z|c) denote thisquantity;it may be skewed, bimodal, or more generally it may assume any arbitrary shape. Let ζ denote the mean or the most probable value of this distribution (it does not matter which, although some of the logic which follows is more transparent if ζ denotes 2 TO CONVOLVE OR DECONVOLVE? themean).Often,ζ (sometimeswithanestimateoftheun- certaintyonitsvalue)istheonlyquantitywhichisavailable. In what follows, we will use spectroscopic and photometric redshiftsfromtheSDSStoillustratesomeofourarguments. Detailsofhowtheearly-typegalaxysamplewasselectedare ⋆ Email:[email protected] in Rossiet al. (2010); thephoto-zsforthissample arefrom † Email:[email protected] Csabai et al. (2003). 2 R. K. Sheth & G. Rossi Figure 1.Distributionofthedifferencebetween spectroscopicandphotometricredshifts(z andζ),atfixedz (top)andζ (bottom), in theSDSSearly-typegalaxysample.Notethatp(ζ|z)isratherwellcentered onz,whereasp(z|ζ)isnotcenteredonζ. 2.1 The redshift distribution dN(z,ζ) dN(z) dN(ζ) = p(ζ|z)= p(z|ζ), (1) dzdζ dz dζ Suppose that the true redshifts z are available for a subset are known.Note that of theobjects; for now,assume that thesubsetis arandom dN(ζ) dN(z) subsample of the objects in a magnitude limited catalog. ≡ dz p(ζ|z). (2) dζ dz Ideally, this subset would have the same geometry as the Z fullsurvey,ascross-correlatingtheobjectswithspectraand The algorithm in Sheth (2007) assumes that p(ζ|z), mea- those without allows the use of other methods (e.g. Caler sured in the subset for which both z and ζ are available, et al. 2009). In practice, this may be difficult to achieve – also applies to the full sample for which z is not available. and this is not required for the analysis which follows, pro- Since dN/dζ is measured in the full dataset, and p(ζ|z) is videdthatthephotometricredshiftestimatordoesnothave known, a deconvolution is then used to estimate the true spatially dependent biases (e.g., as a result of photometric dN/dz. calibrations varying across thesurvey). Suppose, however, that one measured p(z|ζ) instead. Then, because For the objects with spectroscopic redshifts, one can studythejoint distribution of ζ and z (seeFigure 1).Typi- dN(z) dN(ζ) ≡ dζ p(z|ζ), (3) cally,mostphotometricredshiftcodesareconstructedtore- dz dζ Z turnhζ|zi≈z.Thecodeswhichdosoaresometimessaidto one could estimate the quantity on the left hand side by beunbiased,buttheyarenotperfect:thescatteraroundthe ‘convolving’ the two measurables on the right hand side. unbiased mean is of order σ ≈0.05(1+z). This scatter, ζ|z Forthedata-subsetinwhichbothz andζ areavailable,this combinedwiththefactthathζ|zi≈z meansthathz|ζi=6 ζ: is correct by definition. Clearly, to use this method on the the fact that hz|ζi is guaranteed to be biased is not widely largerdatasetforwhichonlyζ isavailable,onemustassume appreciated. However, we show below that it matters little thatp(z|ζ)inthesubsetfromwhichitwasmeasuredremains whether hζ|zi or hz|ζi are unbiased – what matters is that accurate in thelarger dataset. thebias is accurately quantified. Rossi et al. (2010) have shown that the deconvolution Inparticular,ifdN/dζ anddN/dzdenotethedistribu- methodaccuratelyreconstructsthetruedN/dzdistribution tion of ζ and z values in the subset of the data where both fromdN/dζ.Figure2showsthattheconvolutionapproach z and ζ are available, then what matters is that p(ζ|z) and also works well, even when only a random 5% of the full p(z|ζ),where dataset is used to calibrate p(z|ζ) – as displayed in Figure Scaling relations from photo-zs 3 Figure2.DistributionofdN/dζ(dotted)anddN/dz(solid);crossesshowtheresultofconvolvingdN/dζwithp(z|ζ)(fromthebottom panelofFigure1). 1. Thus, for the dataset in which both z and ζ are avail- of ζ, then this sum over ζ is really a sum over c, and the able, both the convolution and deconvolution approaches expression aboveis really are valid, whether or not the means (or, for that matter, tahnedmhooswtepvreorbcaobmlepvliaclauteesd)o(fskpe(wz|eζd),amndulpti(mζ|ozd)aalr)etuhnebsiahsaepde, dNdz(z) ≡ dcdNdc(c)p(z|c)= p(z|ci). (5) Z i of these two distributions. This remains true in the larger X dataset where only ζ is known. However, whereas the con- Equation (5) is one of thekey results of this paper. volution approach assumes that p(z|ζ) is the same in the Although we arrived at equation (5) by requiring the calibration subset as in the full one, the deconvolution ap- mapping c→ζ be one-to-one (as may be the case for, e.g., proach assumes that p(ζ|z) is thesame. LRGs), it is actually more general. This is because one can simply measure p(z|c) in the sample for which spectra are inhand,forthesamereasonthatonecouldmeasurep(z|ζ). 2.2 Convolution directly from colors In fact, p(z|c) is an easier measurement, since it does not depend on theoutput of a photo-z code! The constraint on The integral in equation (3) is really a sum over all the the mapping between c and ζ in the discussion above was objects in the photometric dataset, where each object with simply to motivate the connection between photo-z codes estimated ζ contributesto dN/dz with weight p(z|ζ): andtheconvolutionmethod.Oncetheconnection hasbeen dN(z) dN(ζ) made, however, there is no real reason to go through the dz ≡ dζ dζ p(z|ζ)= p(z|ζi). (4) intermediatestepofestimatingζ,sinceallphoto-zcodesuse Z Xi theobservedcolorscanyway.Inthisrespect,equation(5)is Now,recallthatζ wasthemean(ormostprobable)valueof themoredirectandnaturalexpressiontoworkwiththanis a distribution returned by a photometric redshift code. In equation (4). In particular, because p(z|c) is an observable, cases where the observed colours c map to a unique value the convolution approach of equation (5) is independent of 4 R. K. Sheth & G. Rossi Figure 3. Similar to Figure 1, but for the true absolute magnitude and the estimate from the photometry. Notice that p(M|M) is approximatelysymmetricallydistributedaroundM,whereasp(M|M)canbebothsignificantlyoffsetfromMandskewed. anyphoto-z algorithm.Ofcourse,ifthismethodistowork, c and ζ, hz|ci is the same as the quantity hz|ζi which we then the subsample with spectral information must be able discussed in the previoussubsections.) to providean accurate estimate of p(z|c). SatisfyingL(z|c)=p(z|c)isnontrivial.Thisisperhaps mosteasily seenbysupposingthatthetemplateortraining setconsistsoftwogalaxytypes(early-andlate-types,say), 2.3 Relation to photo-z algorithms for which the same observed colors are associated with two different redshifts. In this case, if the photo-z algorithms Theconvolutionmethodoftheprevioussubsectionprovides are working well, then L(z|c) will be bimodal for at least a simple way of illustrating how one should use the output some c. However, if the sample of interest only contains from photo-z codes that actually provide a properly cali- brated probability distribution L(z|c) for each set of colors LRGs, then p(z|c) may actually be unimodal. As a result, c,toestimatedN/dz.Italso showsinwhat sensethecodes L(z|c) 6= p(z|c) unless proper priors on the templates are used, or care has been taken to insure that the training set should be‘unbiased’. is representative of thesample of interest. In particular, equation (5) suggests that one can es- timate dN(z)/dz by summing over all the objects in the dataset, weighting each by its L(z|c). This is because 2.4 The luminosity function dN(z) L(z|ci)= dz if L(z|c)=p(z|c). (6) We can perform a similar analysis of the luminosity func- Xi tion. In this case, the key is to recognize that, in a mag- Equation (6) shows that if L(z|c) does not have the same nitude limited survey, the quantity which is most directly shape as p(z|c), then use of L(z|c) will lead to a bias; this affected by the photometric redshift error is not the lumi- is the pernicious bias which must be reduced – whether or nosity function φ(M) itself, but the luminosity distribution nothz|ciequalsthespectroscopicredshiftis,insomesense, N(M) ≡ V (M)φ(M) (Sheth 2007). In a spectroscopic max irrelevant. (In the case of a one-to-one mapping between survey,N(M)differsfromφ(M)becauseoneseesthebright- Scaling relations from photo-zs 5 Figure 4. Same as Figure 2, but for the absolute magnitudes. Crosses show the distribution one obtains by convolving the dotted histogramwiththedistributionsshowninthebottom panel ofFigure3;solidhistogramshowsthetruedistributionofM. est objects to larger distances: V (M) is the largest co- pendsonM.Figure3shows p(M|M)andp(M|M);notice max movingvolumetowhichanobjectwithabsolutemagnitude howbroadtheyare,andhowmuchmoreskewedandbiased M could be seen. If we use M to denote theabsolute mag- p(M|M) is than p(M|M). Nevertheless, Rossi et al. (2010) nitude estimated using the photometric redshift ζ, and M haveshownthatthedeconvolutionalgorithmproducesgood its correct value, then results.Figure 4showsthat theconvolution algorithm does as well. N(M)= dMN(M)p(M|M). (7) One estimates φ(M) by dividing N(M) by Vmax(M). Since this weight is the same for all objects with the same Z M, one could have added an additional weighting term to Sheth (2007) describes a deconvolution algorithm for esti- thesum aboveto get mating N(M) given measurements of N(M) and the as- sumption that p(M|M), measured in a subset for which p(M|M) φ(M) = dMN(M) both z and ζ (hence both M and M) are available, also V (M) max applies to the full photometric survey. Z N(M) Following the discussion in the previous section, we 6= dM p(M|M). (9) V (M) could instead have measured p(M|M), and then used the Z max fact that One might have written φ(M) = N(M)/V (M), so the max expression above shows explicitly why the photometric er- N(M)= dMN(M)p(M|M) (8) rorsshouldbethoughtofasaffectingN(M)andnotφ(M). Z To make the connection to p(z|c) and then L(z|c) it is to estimate the quantity on the left hand side by summing worth considering how one computes M from z given the overthephotometriccatalogontherighthandside,weight- observed colors c. If there were no k-correction, then the ing each object in it by p(M|M); note that this weight de- luminosity in a given band would be determined from the 6 R. K. Sheth & G. Rossi observed apparent brightness by the square of the (cosmol- c were determined). Hence, the color magnitude relation, ogydependent)luminositydistance–thecolorsarenotnec- which is really a statement about the joint distribution in essary. In practice however, one must apply a k-correction; two bands,can be estimated by this depends on the spectral type of the galaxy, and hence dN(c) on its color. As a result, the mapping between m and M N(M) = dc dzp(M,z|c) depends on z and c. But it is still true that both M and z Z dc Z awrheicdhetweramsipnreedviboyuscl.yTuhseerdeftoorees,ttihmeastpeepc(tzro|cs)coaplsicosaullboswasmopnlee = dcdNdc(c)p(M|c)= p(M|ci). (13) toestimatep(M,z|c).Thequantityofinterest intheprevi- Z Xi ous section, p(z|c), is simply the integral of p(M,z|c) over Galaxy scaling relations can be estimated similarly, if we allM.Thequantityof interesthere,p(M|c),istheintegral simplyinterpretM asbeingthevectorofobservableswhich of p(M,z|c) over all z. Thus, equation (8) becomes can include sizes, etc. (not just luminosities). In principle, quantitiesotherthancolors(e.g.,apparentmagnitudes,sur- dN(c) N(M) = dc dzp(M,z|c) facebrightness,axisratios)canplayaroleinthephotomet- dc ricredshiftdetermination;thiscanbeincorporatedintothe Z Z dN(c) formalism simply by using c to now denote the full set of = dc dc p(M|c)= p(M|ci), (10) observablesfromwhichtheredshiftandotherintrisicquan- Z Xi tities M were estimated. where the second to last expression writes the integral of If one wishes to use the output from a photo-z code, p(M,z|c) over all z as p(M|c), and the final one writes the ratherthan from thespectroscopic subset, onewould use integral explicitly as a sum over theobjects in the catalog. The expression above is the convolution-type estimate N(M)= L(M|ci), (14) of N(M); it does not require a photometric redshift code. i X However,inprinciple,aphotometricredshiftcodecouldout- havingcheckedthat,inthespectroscopicsubset,L(M|ci)= put L(M,z|c): the quantity such codes currently output, p(M|ci). L(z|c),istheintegralofL(M,z|c)overallM.Therelevant weighted sum becomes N(M)= L(M|ci), (11) 3 DISCUSSION i X Weshowed howpreviouswork ondeconvolution algorithms whereL(M|c)istheintegralofL(M,z|c)overallz,thesum for makingunbiased reconstructions of galaxy distributions is over all the objects in the catalog, and the method only andscalingrelations(Sheth2007;Rossi&Sheth2008;Rossi works if L(M|c)=p(M|c). et al. 2010) could be related to convolution-based meth- Note that the luminosity density (in solar units) can, ods.Whereasdeconvolutionbasedmethodsrequireaccurate therefore, bewritten as knowledgeofp(ζ|z),thedistributionofthephotometricred- shiftζ giventhetrueredshiftz,convolutionbasedmethods j ≡ dMφ(M)10−0.4(M−M⊙) requireaccurateknowledgeofp(z|ζ).Sinceζ isderivedfrom Z photometry, this may more generally be written as p(z|c), = dMN(M)10−0.4(M−M⊙) where c is the vector of observed photometric parameters V (M) max which were used to estimate the redshift. In both cases, Z 10−0.4(M−M⊙) p(z|c)andp(ζ|z)arecalibrated fromasampleinwhichz is = dMN(M) dMp(M|M) V (M) known,and are then used in a larger sample where z is not max Z Z available. If the smaller training set has the same selection 10−0.4(M−M⊙) limits as the larger dataset (e.g., both have the same mag- = dMN(M) M * Vmax(M) (cid:12) + nitudelimit)thenbothapproachesarevalid.Weillustrated Z (cid:12) ourargumentswith measurementsin theSDSS(Figures1– (cid:12) 10−0.4(M−M⊙) (cid:12) 4). = * Vmax(M) (cid:12)ci+. (cid:12) (12) We also showed what additional information must be Xi (cid:12) output from photometric redshift codes if their results are (cid:12) The second to last line show(cid:12)s that one requires the average to be used in a convolution-like approach to provide un- (cid:12) ofhL/Vmax(L)isummedoverthedistributionp(M|M);this biased estimates of galaxy scaling relations. In particular, iseasilycomputedfromdistributionslikethoseshowninthe weargued thatonly if theredshift distribution outputbya bottom panel of Figure 3. The final expression writes this photo-zalgorithm,L(z|c),hasthesameshapeasp(z|c),can as a sum overthe observed distribution of colors. the algorithm be said to be unbiased. Only in this case its output(availableforthefullsample)canbeusedinplaceof p(z|c) (which is typically available for a small subset). The 2.5 Galaxy scaling relations safest way to accomplish this is for the training set to be Although the previous section considered the luminosity a random subsample of the full dataset – and to then tune function in a single band, it is clear that the photometric the algorithm so that L(z|c)=p(z|c). If the training set is redshift codes could output L(M,z|c), where M is a set not representative, then care must be taken to ensure that of absolute luminosities (typically, these will be those asso- L(z|c) does not yield biased results. ciated with the various band passes from which the colors Obtaining spectra is expensive, so the question arises Scaling relations from photo-zs 7 as to whether or not there is a more efficient alternative to REFERENCES the random sample approach. For the convolution method, whichrequiresp(z|c),theanswerisclearly ‘yes’.Thisisbe- Bolzonella M., Miralles J.-M., Pello´ R. 2000, A&A, 363, 476 causesomecolorcombinations(e.g.theredsequence)might giverisetoanarrowp(z|c)distribution,whereasothersmay Budava´ri T. 2009, ApJ, 695, 747 Caler M., Sheth R. K., Jain B., 2009, MNRAS,submitted result in broader distributions. Since it will take fewer ob- jects to accurately estimate the shape of a narrow p(z|c) (arXiv:0811.2805) Christlein D., Gawiser E., Marchesini D., Padilla N. 2009, distributionthanabroadone,observationaleffortwouldbe MNRAS,1381 better placed in obtaining spectra for those objects which produce broad p(z|c) distributions. For the deconvolution Collister A.A., Lahav O.2004, PASP,116, 345 Csabai I., et al. 2003, AJ, 125, 580 approach, one would like to preferentially target those red- CunhaC.E.,LimaM.,OyaizuH.,FriemanJ.,LinH.2009, shifts z which produce broader p(ζ|z) distributions – for MNRAS,396, 2379 similar reasons. But, since z is not known until the spectra Lima M., Cunha C. E., Oyaizu H., Frieman J., Lin H., are taken, this cannot be done, so taking a random sample Sheldon E. S.2008, MNRAS,390, 118 of the full dataset is thesafest way toproceed. Padmanabhan N., et al. 2005, MNRAS,359, 237 Our methods permit accurate measurement of many Rossi G., Sheth R.K., 2008, MNRAS,387, 735 scaling relations for which spectra were previously thought Rossi G., Sheth R.K., Park C., 2010, MNRAS,401, 666 to be necessary (e.g. the color-magnitude relation, the size- Sheth R.K., 2007, MNRAS,378, 709 surface brightness relation, the Photometric Fundamental Plane), so we hope that our work will permit photomet- ricredshiftsurveystoprovidemorestringentconstraintson galaxyformationmodelsatafractionofthecostofspectro- scopic surveys. ACKNOWLEDGMENTS RKSthanksL.DaCosta,M.Maia,P.Pellegrini,M.Makler and the organizers of the DES Workshop in Rio in May 2009 where he had stimulating discussions with C. Cunha and M. Lima about the relative merits of convolution and deconvolutionmethods,andtheAPCatParis7Diderotand MPI-AstronomieHeidelberg,forhospitalitywhenthiswork was written up. Funding for the SDSS and SDSS-II has been provided by the Alfred P. Sloan Foundation, the Participating In- stitutions, the National Science Foundation, the U.S. De- partment of Energy, the National Aeronautics and Space Administration, the Japanese Monbukagakusho, the Max Planck Society, and the Higher Education FundingCouncil forEngland.TheSDSSWebSiteishttp://www.sdss.org/. The SDSS is managed by the Astrophysical Research Consortium for the Participating Institutions. The Partic- ipating Institutions are the American Museum of Natu- ral History, Astrophysical Institute Potsdam, University of Basel,UniversityofCambridge,CaseWesternReserveUni- versity, University of Chicago, Drexel University, Fermilab, the Institute for Advanced Study, the Japan Participation Group, Johns Hopkins University, the Joint Institute for Nuclear Astrophysics, the Kavli Institute for Particle As- trophysicsand Cosmology, theKorean Scientist Group,the Chinese Academy of Sciences (LAMOST), Los Alamos Na- tional Laboratory,theMax-Planck-Institutefor Astronomy (MPIA),theMax-Planck-InstituteforAstrophysics(MPA), New Mexico State University, Ohio State University, Uni- versity of Pittsburgh, University of Portsmouth, Princeton University, the United States Naval Observatory, and the University of Washington.

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.