Submitted to ApJ PreprinttypesetusingLATEXstyleemulateapjv.5/2/11 PROPER IMAGE SUBTRACTION - OPTIMAL TRANSIENT DETECTION, PHOTOMETRY AND HYPOTHESIS TESTING Barak Zackay, Eran O. Ofek and Avishay Gal-Yam BenoziyoCenterforAstrophysics,WeizmannInstituteofScience,76100Rehovot,Israel Submitted to ApJ ABSTRACT Transient detection and flux measurement via image subtraction stand at the base of time domain astronomy. Duetothevaryingseeingconditions,theimagesubtractionprocessisnon-trivial,andex- 6 istingsolutionssufferfromavarietyofproblems. Startingfrombasicstatisticalprinciples,wedevelop 1 the optimal statistic for transient detection, flux measurement and any image-difference hypothesis 0 testing. We derive a closed-form statistic that: (i) Is mathematically proven to be the optimal tran- 2 sient detection statistic in the limit of background-dominated noise; (ii) Is numerically stable; (iii) b For accurately registered, adequately sampled images, does not leave subtraction or deconvolution e artifacts; (iv) Allows automatic transient detection to the theoretical sensitivity limit by providing F credible detection significance; (v) Has uncorrelated white noise; (vi) Is a sufficient statistic for any 0 furtherstatisticaltestonthedifferenceimage,andinparticular,allowstodistinguishparticlehitsand 1 otherimageartifactsfromrealtransients; (vii)Issymmetrictotheexchangeofthenewandreference images; (viii) Is at least an order of magnitude faster to compute than some popular methods; and ] (ix) Is straightforward to implement. Furthermore, we present extensions of this method that make M it resilient to registration errors, color-refraction errors, and any noise source that can be modelled. In addition, we show that the optimal way to prepare a reference image is the proper image coaddi- I . tion presented in Zackay & Ofek (2015b). We demonstrate this method on simulated data and real h observations from the Palomar Transient Factory data release 2. We provide an implementation of p this algorithm in MATLAB and Python. - o r st 1. INTRODUCTION hint is that none of the methods defines the matched fil- a Detection of previously-unknown transient sources is ter that one should use in order to detect transients in [ the difference image. at the base of many fields of astronomy. Examples in- 2 clude: the searches for supernovae, microlensing events IntheAlard&Lupton(1998)andBramich(2008)class of solutions, a complex inversion problem needs to be v and light echos. To remove a constant complex back- solved. Thisinversionproblemcanberegardedasareg- 5 ground, it isusefultoperformdigitalimagesubtraction, ularization effort (e.g., Becker et al. 2012) on the partial 5 a problem that has proven to be hard to tackle, with deconvolution done by Phillips & Davis (1995). Apart 6 several suggested solutions (e.g., Phillips & Davis 1995; from being computationally slow, this inversion problem 2 Alard & Lupton 1998; Bramich 2008; Gal-Yam et al. 0 2008;Yuan&Akerlof2008). Probablythemostpopular is in itself an effective deconvolution, and the numerical . algorithms are by Alard & Lupton (1998) and Bramich instability of the deconvolution process cannot be swept 1 under the rug. These algorithms explore the trade off 0 (2008). between ringing artifacts in the subtraction image, that 6 Current methods have several problems and limita- areduetotheeffectivedivisionintheFourierplane,and 1 tions. An important difficulty in image subtraction is residuals from the constant-in-time sky that are due to : that the point spread function (PSF) of images taken v a failure of equalizing the PSFs of the reference and the from the ground is varying1. In some cases, the subtrac- i newimages. Forexample,ifthePSFofthenewimageis X tion is based on a numerically unstable process (decon- sharperthanthePSFofthereferenceinsomeaxis, then volution) that may generate subtraction artifacts. Com- r thesemethodsfindnogoodsolutionsleadingtomultiple a binedwithill-definederrorpropagation2,itisdifficultto image artifacts. decide if a transient candidate is real or rather due to a These artifacts, along with residuals caused by regis- subtraction artifact. Finally, there is no proof that any trationerrors,appearasfalsepositivesignalsthathinder of the methods we are currently using is optimal. As we the automatic detection of transients. The current state will show in this paper, none of these algorithms is op- of the art solution to this problem is to train a machine- timal. One hint for this is that some of these methods learning algorithm (e.g., Bloom et al. 2012; Goldstein arenotsymmetrictoexchangeofthereferenceimageand et al. 2015; Wright et al. 2015) to filter most of the ar- thenewimage,whiletheproblemissymmetric. Another tifacts and reduce the number of false positives to the minimum. However, this solution is partial and human [email protected] [email protected] scanners are required to sift through all remaining can- 1 Sometimesthisisrelevantalsoforspace-basedobservation. didate detections and decide which is real and which is 2 Thereareseveralreasonswhythecurrentmethodspropagate not (e.g., Gal-Yam et al. 2011; Smith et al. 2011). the errors incorrectly. One reason is that convolution generates This elaborate process can undermine the successful correlatednoise,whichistypicallyignored. Secondisthatusually operation of transient searches in many ways. First, theerrorsinthereferenceimagearenotprojectedcorrectly. 2 Zackay, Ofek and Gal-Yam employing many human scanners can be cumbersome matchfiltering4, reproducestheoptimaltransientdetec- and expensive. Current surveys are spending consider- tion statistic. Using this image, it is possible to detect able manpower on candidate sifting (e.g., PTF). With- and filter out particle hits in both the reference image out further dramatic improvement, this use of human and the new image, separating these artifacts from real scanners is unscalable, and is unfeasible for future sur- transients. Another potential use of this image is the veyslikeZTF(Bellmetal.2015)andLSST(Ivezicetal. optimal detection of photometric variability and astro- 2008). Second, having humans in the loop introduces a metric motion of stars, that works in arbitrarily dense time delay in the transient detection. This can compro- environments. mise science cases in which it is of utmost importance Wedemonstratetheefficacyofouralgorithmonsimu- to make rapid follow-up observations of new transients latedandrealimagesthatarepartofthePalomarTran- (e.g., Cenko et al. 2013, Gal-Yam et al. 2014, Cenko sient Factory (PTF; Law et al. 2009), data release 2. et al. 2015). Moreover, our experience is that at least The outline of the paper is as follows: In §2 we review some machine learning algorithms throw away real ob- the state of the art image subtraction methods, while in vious transients. Furthermore, the human scanning step §3 we derive our optimal transient detection and image makes it difficult to estimate the completeness of tran- subtractionalgorithm. In§4wediscussthepropertiesof sient surveys as human scanners are difficult to properly the derived image subtraction statistic. A step by step simulate. Another problem is that even human scanners summary of the image subtraction process is presented can be unsure if a transient is real or an artifact, and in §5. In §6 we present tests on simulated and real data, many surveys adopt the methodology of accepting only while in §7 we describe our code which is available on- candidates that are persistent in two or more consecu- line. In §8 we discuss the implementation details and we tive observations3 (e.g., Gal-Yam et al. 2011, Baltay et conclude in §9. al.2013). Thismethodologytradesthesurveyspeedwith 2. BRIEFOVERVIEWANDANALYSISOFEXISTING theincreasedcredibilityofthecandidates,andcausesan METHODSFORIMAGESUBTRACTION additional time delay in transient detection. Last, hu- Previously suggested solutions for image subtraction man scanning makes it difficult to detect transients at can be divided into two variants. The first, and more the faintest limit, as it is hard for humans to objectively popular variant, can be referred to as regularized par- quantify the false alarm probability. tial deconvolution. Solutions we include in this family In this paper, we present a closed-form solution for are Phillips & Davis (1995); Alard & Lupton (1998) and image subtraction in general, and transient detection in Bramich (2008). Gal-Yam et al. (2008) suggested a sec- particular. Starting with the most basic statistical prin- ond variant, which we call cross filtering, while Yuan & ciples, we solve the problem of transient detection un- Akerlof (2008) advocated for a mix of the two methods. der the assumption that both the reference and the new Denoting the new image by N and its PSF by P , images have white Gaussian noise (e.g., the background- N the reference image by R and its PSF by P , the first noise- or read-noise-dominated limit). We then charac- R approach attempts to find a convolution kernel k such terizethestatisticalbehaviorofourclosed-formtransient that: detectionstatisticundertheinfluenceofsourcenoiseand astrometric errors. Based on this analysis, we then con- N −k⊗R∼=0. (1) structacorrectiontermtothetransientdetectionstatis- tic that prevents false positive detections in the vicinity Here ⊗ represents convolution. of bright objects. Our solution is always numerically The first solution for finding the kernel k was given by stable, is trivial to implement and analyze, and is sig- Phillips & Davis (1995). They suggested to perform a nificantly faster computationally than the popular algo- deconvolution solution in Fourier space: rithms (e.g., Alard & Lupton 1998; Bramich 2008). We extend the transient detection statistic to the situation (cid:98)k = P(cid:99)n ∼= N(cid:98), (2) of multiple references, and show that the optimal refer- P(cid:99)r R(cid:98) ence image for image subtraction is the proper coaddi- were represents Fourier transform. However, this so- tion image given in Zackay & Ofek (2015b). Finally, we (cid:99) lution is numerically unstable as the deconvolution op- show that the transient detection statistic is the maxi- eration can (and many times does) involve division by malS/Nestimatorfortransientfluxmeasurementinthe small numbers. This problem is apparent from Equa- background-dominated noise limit. tion 2, where the denominator might approach zero as We further develop the optimal transient detection fast or faster than the numerator. Given that any mea- statistic into a difference image statistic that has white surement process contains noise, this division operation noise. Then, we show that any statistical measure- amplifies the noise in Fourier space, which in turn gen- ment or decision on the data can be performed opti- erates correlated noise in real space. The extreme cases mally and intuitively on this difference image, which ofthiscorrelatednoisearethecharacteristicringingand we call the proper image subtraction statistic. This im- sinusoidal artifacts that deconvolved images suffer from. age has many good qualities such as: in the case of no Alard & Lupton (1998) suggested a practical way to difference between the reference and the new image, it mitigate the numerical instability problem. Represent- has expectancy zero everywhere and uncorrelated addi- ing k as a set of basis functions, and noting that Equa- tive Gaussian noise. It has an effective PSF that, by tion 1 is linear, they suggested to solve for k using linear 4 Also called cross-correlation of the images with its PSF. See 3 Thisstepisalsorequiredforunknownminorplanetidentifica- e.g., Zackay & Ofek (2015a) for a derivation of the matched filter tion. solution. Proper Image Subtraction 3 least squares. Alard & Lupton (1998) suggested to use number of solutions. This is because for any K , we can r a set of basis functions which are linear combinations of find K that satisfies Equation 4 in the least squares n Gaussians multiplied by low degree polynomials. Later sense. It is clear from this simple analysis that all sub- on, Bramich(2008)suggestedtosolveforthevaluesofa traction methods mentioned are focused on making the pixelizedkernel. Alloftheabovemethodscanbeviewed PSFofthetwoimagesidentical,withverylittleattention asregularizationofthedeconvolutionmethodofPhillips tothemaximizationofthesignal-to-noiseratio(S/N)of & Davis (1995) – i.e., restricting the solutions for the atransientsourcethatappearsinoneoftheimages. Ina kernelk tofinitesizeandtosomesetoflogicalsolutions. sense,thesemethodsdonotsolvethetransientdetection Even though the numerical stability of these algorithms problem, but a different problem which is how to make is much better than that of Equation 2, they still have two images as similar as possible using convolution. In several problems. First, the division by zero problem is this paper, we rigorously derive a method that cancels stillthereanditcanbecomeespeciallypronouncedwhen the constant-in-time image and maximizes the S/N of a thenewimagehasanarrowerPSF(includingaPSFthat transientsourceatthesametime. Wenotethatthereare is narrower in anysingle axis) compared tothe reference severalwaystoderivethismethod. Herewewillderiveit image. It is interesting to note that these methods are fromfirstprinciplesviamodelingthetransientdetection not symmetric to the exchange of the new and reference with simple hypothesis testing and using the lemma of image, while the problem is symmetric to this exchange. Neyman & Pearson (1933). Second, although these methods are intuitive, they lack statistical justification, there is no rigorous proof they 3. STATISTICALDERIVATION cause no information loss, and it is unclear what further Giventhenumerousproblemswithexistingimagesub- image processing should be applied. For example, do we traction methods, we would like to place the transient need to apply another matched filter to the subtracted detection problem on firm statistical grounds. In §3.1 image in order to detect transients? If so, which filter weoutlinethederivationandformulaeofourimagesub- shouldweuse? Third, usingthesemethodstheresulting traction statistics. Given that the full derivation is te- pixel noise is correlated and there is no simple analytic dious we defer it to Appendix A. In §3.2 we show that prescriptiononhowtosetadetectionthresholdfortran- the best way to build a reference image, for the purpose sientsearch5. Therefore,itishardtodecideifadetected of image subtraction, is to use the image coaddition al- sourceisrealoranartifact,ortoquantifytheprobability gorithm of Zackay & Ofek (2015b). Our derivation in of it being a false positive. In addition, in the effort of §3.1assumesthattheimagesarebackground-noisedom- suppressing the deconvolution artifacts, these solutions inated (i.e., the objects we care about have source noise sacrifice the cancellation of the constant-in-time image. which is lower than the background noise). This causes This will cause large and pronounced subtraction arti- an underestimation of the noise near bright sources. In facts, that will prevent identification of transients that §3.3 we present a simple correction to the image sub- are substantially fainter than their hosting environment. traction formulae that takes care of the source noise and Finally, using inversion methods for image subtraction other errors, like registration noise. In §3.4 we present (i.e., linear least squares) makes the subtraction process an accurate treatment of astrometric shifts, noise and slow, compared with e.g., the Fourier space solution of color-refraction errors. In §3.5 we outline our suggested Phillips & Davis (1995). method to equalize the flux zero points of the new and The cross filtering solution suggested by Gal-Yam et referenceimages. In§3.6weprovideanalgorithmforop- al. (2008) is to convolve the new image with the PSF of timalPSFphotometryinthesubtractionimage,whilein the reference image and to convolve the reference image §3.7wedescribehowthismethodcanbeusedforcosmic- with the PSF of the new image: ray, bad pixels and reflection-ghost identification. S =P ⊗N −P ⊗R. (3) GY08 r n 3.1. Transient source detection using image subtraction This solution is always numerically stable, and leaves no Here we derive, from first principles, an optimal subtraction artifacts. The problem with this solution is, methodfortransientsourcedetection,undertheassump- again, the lack of statistical justification, and that the tions that the images are background-noise dominated, matched filter for source detection is not specified. and the noise is Gaussian and independent6. Yuan & Akerlof (2008) suggested to apply kernels for Let R and N be the background-subtracted reference both R and N, both chosen from a family of PSFs de- imageandthebackground-subtractednewimage,respec- termined by few parameters, and to drive the solution tively. DenotebyT thebackground-subtractedtruecon- towards spatially small kernels by adding the effective stant sky image. Denote by P and P the point spread PSF area to the loss function. r n functions(PSFs)ofthereferenceimageandthenewim- It is worthwhile to note that the problem of subtract- age, respectively. P and P are normalizedto have unit ing two images, and minimizing the resulting difference r n sum. We assume that P , P , and the flux-based zero image in the least square sense has an infinite number of n r points7 of the new image (F ) and reference image (F ) solutions (see also Yuan & Akerlof 2008). For example, n r are known. We present a method for finding F and F the linear equation: n r in§3.5,andthePSFmeasurementsarediscussedin§8.2. K ⊗R−K ⊗N ∼=0, (4) r n 6 Inpracticethepixelsmaybeslightlycorrelatedduetocharge where Kr and Kn are arbitrary kernels, has an infinite repulsionandchargediffusioninaCCD. 7 FollowingZackay&Ofek(2015a,2015b)thisfactorrepresents 5 One method to estimate the noise level is using Bootstrap the product of atmospheric transparency, telescope and detector simulations(e.g.,Ofeketal.2014). transmissionandintegrationtime. 4 Zackay, Ofek and Gal-Yam The expression for the reference image is: in the score image refers to a different q position. It is important to note that Equation 12 is a matched filter R=F T ⊗P +(cid:15) , (5) r r r imageandnofurtherfilteringisrequired. Inordertofind where(cid:15) istheadditivenoisecomponentoftheimageR. transients all we need to do is to identify local maxima r Giventhenullhypothesis,H ,thatstatesthereareno (or minima) in S. The significance of a local maximum, 0 new sources in the new image we can write: in units of sigmas, is given by its value divided by the standard deviation of the image S. N =F T ⊗P +(cid:15) . (6) |H0 n n n Since Equation 12 is a matched filter image, its pix- els are correlated, and any hypothesis testing or mea- Given the alternative hypothesis, H (q,α), that states 1 surement, other than transient detection and photom- there is a new point source at position q with flux α in etry (see §3.6), requires a knowledge of the covariance the new image, we can write: betweenthepixels. Anexampleforsuchhypothesistest- N =F T ⊗P +αF δ(q)⊗P +(cid:15) , (7) ing is cosmic-ray identification via image subtraction, or |H1(q,α) n n n n n searching for variable nebulosity (e.g., light echos). In where δ(q) denotes a two dimensional image with one order to have an image-subtraction method that is op- at position q, and zero otherwise. We assume that the timal for all purposes and easy to use, we need to iden- dominantsourceofnoiseisthebackgroundnoise, (cid:15) and r tifyanimagewhosepixelnoiseisuncorrelated, andthat (cid:15) both satisfy that all pairs of pixels are uncorrelated – n cross-correlating this image with its own PSF returns i.e., that for all pairs of pixels x ,x for which x (cid:54)=x : 1 2 1 2 Equation 12. In Appendix A we identify such an image Cov((cid:15) [x ],(cid:15) [x ])=0,Cov((cid:15) [x ],(cid:15) [x ])=0, (8) as: r 1 r 2 n 1 n 2 and that all pixels have spatially uniform variance8: D(cid:98) = (cid:113) FrP(cid:99)rN(cid:98) −FnP(cid:99)nR(cid:98) . (13) V((cid:15)r[x])=σr2,V((cid:15)n[x])=σn2. (9) σn2Fr2|P(cid:99)r|2+σr2Fn2|P(cid:99)n|2 Because both hypotheses are simple9, we can use the The PSF of this image, normalized to have unit sum, is Neyman-Pearsonlemma(Neyman&Pearson1933),that given by: states that the most powerful10 statistic for deciding be- tween two simple hypotheses is the likelihood ratio test: P(cid:99)D = (cid:113) FrFnP(cid:99)rP(cid:99)n , (14) L(q,α)= P(N,R|H0) , (10) FD σn2Fr2|P(cid:99)r|2+σr2Fn2|P(cid:99)n|2 P(N,R|H (q,α)) 1 where F is the flux-based zero point of the subtraction D where P denotes probability. A critical point is that image, which is given by: we do not have any prior information or assumptions F F on T. Therefore, we cannot calculate the probabilities F = r n (15) D (cid:112) P(N,R|H0)andP(N,R|H1(q,α))directly. However,we σn2Fr2+σr2Fn2 cancalculatetheirratiobydevelopingtheexpressionus- Indeed, using this difference image D and its PSF we ing the law of conditional probabilities can verify that the cross-correlation of D with P re- D P(N|R,H )P(R|H ) turns: L(q,α)= 0 0 . (11) ←− P(N|R,H1(q,α))P(R|H1(q,α)) S =FDD⊗PD, (16) Next we can use the fact that H0 and H1 predict the wherethebackwardarrowsigndenotescoordinaterever- same likelihood to the reference and cancel out the last ←− sal (i.e., P(x,y)=P(−x,−y)). Alternatively in Fourier multiplicative terms in the numerator and denominator. space After some algebra, which is detailed in Appendix A, we can find the optimal statistic for source detection S(cid:98)=FDD(cid:98)P(cid:99)D. (17) S(cid:98)≡ 1(cid:92)logL= FnFr2P(cid:99)n|P(cid:99)r|2N(cid:98) −FrFn2P(cid:99)r|P(cid:99)n|2R(cid:98), (12) doImtinisateimdpnooritsaenltimtiot, nDoties athparto,peinr itmhaegeb,aacnkdgrohuenncde- α σr2Fn2|P(cid:99)n|2+σn2Fr2|P(cid:99)r|2 we call it the proper subtraction image. As in Zackay & where the over-line symbol denotes the complex conju- Ofek (2015b) we define a proper image to be an image gate operation. We note that by putting the over-line whose noise is independent and identically11 distributed sign above the hat sign we mean that the complex con- (i.i.d). ThismeansthatDcanbeusedforanyhypothesis jugateoperationfollowstheFouriertransformoperation. testingormeasurement, withouttheneedforthecovari- Thisstatistic(orscoreimage)issimplythelog-likelihood ance between the pixels. Furthermore, in Appendix E ratio test between the two hypotheses. This score is cal- we present a proof that D and P are in fact sufficient D culatedsimultanouslyforallvaluesofα,whileeachpixel statistics12 for any hypothesis testing or measurement. 8 As the convolution is a local operation, this assumption can 11 In practice the noise levels need to be identical only locally berelaxed(seediscussioninZackay&Ofek2015a). (on scales which are twice the PSF size), as the convolution is a 9 A simple hypothesis has no unknown parameters. We are localoperation. InthevicinityofbrightstarsD isnotproper. applyingthehypothesistestingtoeachvalueofαandqseparately. 12 In statistics, a statistic is sufficient with respect to a sta- 10 Thepowerofabinaryhypothesistestistheprobabilitythat tistical model and its associated unknown parameter if no other the test correctly rejects the null hypothesis when the alternative statisticthatcanbecalculatedfromthesamesampleprovidesany hypothesisistrue. additionalinformationastothevalueoftheparameter. Proper Image Subtraction 5 Equation 13 and its PSF (Eq. 14) are adequate for The PSF (normalized to have unit sum) of the reference detection of objects whose original shape was convolved image is given by: withthetelescopeandatmospherePSF.However,parti- (cid:114) cdleerihviettehveenPtSsFdoinntohtesdhiaffreeretnhcisePimSaFg.eIDn,Aopfpaenδ-dfiuxnEctiwone (cid:80)j Fσj22|P(cid:99)j|2 in N or R. The PSF in the difference image D of a P(cid:99)R = Fj , (23) δ-function in N is: r where F is the flux-based zero point of the reference: P(cid:100)DN = (cid:113) FrP(cid:99)r , (18) r (cid:118) FDn σn2Fr2|P(cid:99)r|2+σr2Fn2|P(cid:99)n|2 Fr =(cid:117)(cid:117)(cid:116)(cid:88)Fσj22. (24) while the PSF in the difference image D of a δ-function j j in R is: Not surprising, this is identical to the optimal coaddi- P(cid:100)DR = (cid:113) FnP(cid:99)n . (19) ttihoantmtheethroeadsodnerRivepdreinseZrvaecskaayll&thOefienkfo(r2m0a1t5ibo)n.fWroemntohtee FDR σn2Fr2|P(cid:99)r|2+σr2Fn2|P(cid:99)n|2 individual references is because in the computation of each frequency in R, we add random variables scaled by These PSFs are also accompanied by the corresponding their (conjugate) expectation, divided by the variance. zero-points,F ,F thatcanbefoundinAppendixE. DN DR We can identify this operation as the maximal S/N ad- These equations are useful if one would like to search dition of random variables (see Appendix A of Zackay & foreventswhicharesimilartoadeltafunction(e.g.,bad Ofek 2015a). The reader should refer to Zackay & Ofek pixels). We note that P(cid:100)DN and P(cid:100)DR in many cases can (2015b) for analysis and proof of sufficiency of this so be approximated by a delta function. called proper coaddition method. To summarize, in order to find a transient source in either the reference or the new image we can calculate D (Eq. 13) and cross-correlate it with its PSF (Eq. 14). 3.3. Simple, suboptimal correction for source noise, astrometric noise and color-refraction noise Alternatively, we can calculate directly the statistic S (Eq. 12). Equation 12 ignores the source noise, and hence the noise level is underestimated in the vicinity of bright 3.2. Construction of the reference image stars. The outcome of this will be that bright sources may be flagged as possible transients or variables. Fur- Typically,thereferenceimageisbuiltbycoaddingmul- thermore, this equation ignores any additional impor- tiple images. Here we will show that the best way to tant sources of noise like astrometric noise, astrometric produce a reference image for subtraction is using the scintillationnoise,color-refractionnoise,fluxscintillation method described in Zackay & Ofek (2015b). noise, and position-dependent flat-fielding errors. In the case of multiple reference images we need to A simple correction to this problem, albeit subopti- replace Equation 5 with the model for the j-th reference mal, is to divide S by a correction factor that takes into image: account the local estimated variance of the extra noise. R =F P ⊗T +(cid:15) . (20) j j j j Derivation of this correction factor is presented in Ap- Here F is the flux-based zero point of the j-th reference pendixC.Intheimagespace, theexpressionforthecor- j image, P is the PSF of the j-th reference image, and (cid:15) rected S is j j is the noise of the j-th reference image. S Asbefore,themodelforN assumingthenullhypothe- S = . corr (cid:112) sis, H , is given by Equation 6, while if the first hypoth- V(S )+V(S )+V (S )+V (S )+... 0 N R ast N ast R esis, H , is true then N is given by Equation 7. (25) 1 As in the previous section, we would like to decide be- Here the terms in the denominator may include any tweentwosimplehypotheses. Therefore,theoptimaltest position-dependent contribution to the variance, that is statistic is the likelihood ratio test (Neyman & Pearson not included in the σ2 and σ2 factors. n r 1933) In this example we list two specific contributions from the source noise and from astrometric noise. The first P(N,R ,...,R |H ) L(q,α)= 1 J 0 . (21) two terms in the denominator are the variance from the P(N,R1,...,RJ|H1(q,α)) source noise in the new and reference images, respec- tively, while the next two terms are the variance due As before, we can use the law of conditional probabili- to astrometric noise. Other sources of noise like color- ties, and the fact that H and H predict the same like- 0 1 refraction can be added in a similar manner. lihood for all references. The full derivation is presented HereV(S )isthevarianceofthepartofS containing in Appendix B, and after some algebra we find that the N N given by optimal reference image is given by V(S )=V((cid:15) )⊗(k2), (26) N n n R(cid:98)= (cid:114)(cid:80)j Fσj2jP(cid:98)jR(cid:99)j . (22) and V(SR) is the variance of the part of S containing R (cid:80)j Fσjj22|P(cid:98)j|2 given by V(SR)=V((cid:15)r)⊗(kr2), (27) 6 Zackay, Ofek and Gal-Yam and the Fourier transform of k is given by induced by the Earth turbulent atmosphere (see §8.5). r Therefore, even in the case of high quality registration, k(cid:98)r = FrFn2P(cid:99)r|P(cid:99)n|2 , (28) wuaelsexdpueecttothaasttraollmbertigrihctssctianrtsilwlaitlilohnavneoissueb.tractionresid- σr2Fn2|P(cid:99)n|2+σn2Fr2|P(cid:99)r|2 Fortunately, duetotheclosedformandnumericalsta- while the Fourier transform of k is bility of our method, the shape of the subtraction resid- n uals is fully predictable, given the astrometric shift and k(cid:99)n = FnFr2P(cid:99)n|P(cid:99)r|2 . (29) trhefeerfleunxcedaiffnedreansciet abpetpweaeerns itnhtehsetanrewasimitaagpep.eTahrserienfotrhee, σr2Fn2|P(cid:99)n|2+σn2Fr2|P(cid:99)r|2 wecanusethistomeasuretheastrometricshiftandflux The variance of (cid:15) and (cid:15) are simply the variance im- variability for each star. n r ages. For a single image the variance map, V((cid:15) ), is Foradequately13 sampledimages,thisproposedmech- n simply the number of electrons in each pixel (including anismisaccurate,anditallowsustomeasureastrometric the background), added with the readout noise squared. shifts and variability in very crowded fields. The details However, in the case of multiple images, the correct way of this method will be presented in a future publication, to construct V(S ) is to calculate k , V((cid:15) ), and V(S ) buthereweprovideabriefoutline: Theastrometricshift R r r R for each reference image and to sum all the individual and photometric variability kernel is: V(S ) values up (see Appendix B). However, in many R cases a reasonable approximation is to calculate kr from P(cid:98)S(αn,αr,∆x,∆y)=P(cid:99)D(αr−αns(cid:98)). (34) theproperlycoadded image, andcalculateV((cid:15) )using a r simple addition of all the images (in units of electrons) Hereαn isthefluxofthesourceinN andαr isitsfluxin from which the reference was constructed (i.e., the num- R, ands(cid:98)istheshiftoperator(includingsub-pixelshifts) ber of electrons in each pixel including the background) inFourierspace. Thisoperatorisafunctionoftheshifts added with the total readnoise squared. ∆x and ∆y. Using Equation 34, we can treat residuals Next, the astrometric variance terms are given by detected in S more carefully than we did in §3.3. Specif- ically, we can now perform hypothesis testing to decide (cid:16)dS (cid:17)2 (cid:16)dS (cid:17)2 V (S )=σ2 N +σ2 N , (30) between e.g., H0: changes are consistent with stationary ast N x dx y dy and non variable source; or H : the star moved or its 1 flux changed. This scheme can be applied to any part where σ and σ are the astrometric registration noise x y of D, for which we identify a significant peak in S (e.g., in the x and y axes, respectively, while ddSxN and ddSyN are above 3σ). Apart from using this to eliminate false pos- the gradients of S in the x and y directions. Here the itives, we can now use this to detect and measure new N Fourier transform of S is given by kindsofsignals. Forexample, wecanuseittosearchfor N moving objects blindly, even in the presence of complex, S(cid:99)N =k(cid:99)nN(cid:98). (31) constant in time, structure in the background. In a similar manner (cid:16)dS (cid:17)2 (cid:16)dS (cid:17)2 3.5. Matching the local zero points, background flux and V (S )=σ2 R +σ2 R . (32) astrometric shift ast R x dx y dy Oursolutionsofarassumedthatthevaluesoftheflux- Here the Fourier transform of SR is given by based zero points (Fr and Fn), the background levels, (B and B ), and the relative astrometric shift (∆x and S(cid:99)R =k(cid:98)rR(cid:98). (33) ∆yn) are knrown. Careful analysis of Equation 13 shows Theoriginofthesetermsisthatastrometricnoisecauses that, in practice, we only care about the flux zero points shifts in individual PSFs. The noise induced by these ratio shiftsisproportionaltothedifferencebetweenneighbor- β ≡F /F , (35) ing pixels (i.e., the gradient). n r We note that in practice the astrometric registration the background difference, noise is the rms of the registration fitting process. This term include both registration errors and the astromet- γ ≡B −B , (36) n r ric scintillation noise. In some cases the quality of the registration is position dependent. In this case it is pos- and the translation (∆x, ∆y). sible to replace the scalars σX and σY by matrices of By substituting Equations 35 and 36 into D (Eq. 13), the position-dependent noise. In §3.4 we suggest a more andintroducingtheshiftoperatorwecangetthedesired accurate treatment of the astrometric noise component. expressionweneedtominimizeinordertofindβ,γ,∆x, and ∆y. This can be done either locally (in small sec- 3.4. Accurate treatment of astrometric noise and flux tions of the image), or globally. For simplicity, and since variability we already discussed astrometric shifts in §3.4, here we Astrometric errors and shifts are a major problem for neglecttranslations, buttowardtheendwewillmention imagesubtraction. Forexample,forabrightsourcewith how this can be incorporated. 104electrons and full-width at half maximum (FWHM) of 2pixels, the astrometric error induced by the Poisson 13 ByadequatelysampledimageswemeanthatthePSFwidth noise will be about a few tens of milli-pixels. This is is sampled by at least two pixels. This can be referred to as the equivalent to the typical astrometric scintillation noise NyquistsamplingofthePSFbythecamera. Proper Image Subtraction 7 In order to find β and γ we need to compare the two where f indicates spatial frequencies. The standard de- parts of D(cid:98): viation of this estimator is (cid:112) V(S )+V(S ) D(cid:99)n(β)= (cid:113) P(cid:99)rN(cid:98) , (37) σα(cid:103)(q) = NFS R , (43) σn2|P(cid:99)r|2+β2σr2|P(cid:99)n|2 whereV(S )andV(S )aredefinedinEquations26–27. N R and Note that Equation 41 can be used to measure the PSF flux of all the transients in the image simultanously. D(cid:99)r(β)= (cid:113) P(cid:99)nR(cid:98) . (38) 3.7. Cosmic ray, bad pixel and ghosts identification σn2|P(cid:99)r|2+β2σr2|P(cid:99)n|2 TheimagesubtractionstatisticD canbeusedtoiden- Note that we replaced F and F by β. All we need tify cosmic rays and bad pixels. A major advantage of n r using the proper image subtraction over other image- to do is to inverse Fourier transform D(cid:99)n and D(cid:99)r and to differencing techniques is that its pixels noise is uncor- solvethefollowingnon-linearequationforβ andγ(cid:48) (and related and usually it roughly preserves the shape of optionally ∆x and ∆y): sources which are similar to δ-functions. This means D (β)=βD (β)+γ(cid:48) (39) thatinmostcasesonecanidentifyparticlehitsbyapply- n r ing edge-detection algorithms (e.g., van Dokkum 2001), where without any modifications, directly on D. γ An alternative approach is to use a rough model for γ(cid:48) = . (40) (cid:112) the shapes of particle hits and bad pixels, and to per- σ2 +β2σ2 n r form a composite hypothesis testing. The log-likelihood Notethatthesolutionshouldbeperformedintheimage ofobservingD,ifanobjectatpositionqisapointsource domain. If we are interested in solving also for small transient (Hps hypothesis) with flux α, is given by translations,weneedtomultiplyD(cid:99)r andγ(cid:48)withtheshift −log(P(D|H (q)))=(cid:88)||D−αF ←P−⊗δ(q)||2, operator. If we trust that the images were background ps D D subtracted and aligned correctly, then we can set γ =0, x (44) ∆x = 0, ∆y = 0 and use the same expression to solve only for the value of β. whilethelog-likelihoodofD iftheobjectatpositionq is Equation 39 is non-linear in β. Therefore iterative so- a cosmic ray with flux α and with shape P in N, (H lutions are required. For example, in the first iteration cr cr hypothesis) is set β =1 and solve for the new value of β, and use it in the next iteration to find a new value of β, until conver- −log(P(D|H (q)))=(cid:88)||D−αP ⊗←P−−⊗δ(q)||2. gence14. Furthermore, it is important to note that one cr cr DN x must use robust fitting methods in order to solve Equa- (45) tion 39. The reason is that there may be bad pixels, particle hits, astrometric noise, and saturated pixels in Here x is the subset of pixels that contain the source the images. It is also recomended to remove the images- of interest (e.g., an area with a width twice that of the edge pixels prior to fitting β. PSF around the source). The difference between Equa- tions 44 and 45 (using appropriate priors, such as the 3.6. PSF photometry in the difference image probability of seeing a transient at a certain magnitude andtheprobabilityofseeingacosmicraywiththisflux) In this section we present a statistic for measuring the PSF photometry15 of a source in the difference image. is a statistic that can be indicative (after setting the ap- propriate threshold) for deciding whether the detected Thismeasurementstatisticisunbiasedandhasmaximal transient is a cosmic ray or an astronomical transient. S/N amongallestimatorswhicharelinearcombinations We note that in this case the flux of the source, and the of the input images. However, this statistic is optimal intensity and shape of the cosmic ray are free param- only for the background-dominated-noise limit. A full eters of the model. Therefore, this is a classic case of derivation of this statistic is presented in Appendix D. composite hypothesis testing. ThebestlinearestimatorforthePSFphotometryofa The same approach can be used to identify internal- source at position q is reflection ghosts. In this case we need to replace the S(q) shape P with the shape of a reflection ghost. For ex- cr α(cid:103)(q)= . (41) ample, an extended kernel (e.g., top hat filter) which is F S wider than the stellar PSF. Here F is the flux normalization of S: S 4. PROPERTIESOFTHENEWIMAGESUBTRACTION F =(cid:88) Fn2|P(cid:99)n|2Fr2|P(cid:99)r|2 , (42) METHOD S f σr2Fn2|P(cid:99)n|2+σn2Fr2|P(cid:99)r|2 tioNnopwrotbhlaetmw,ewheacvaenaannoaplytzimeaitlssporluoptieorntifeosratnhdecsoumbtpraarce- it to other methods, seeking an intuitive understanding. 14 Wefoundthatusuallyβ convergesin2–3iterations. 15 PSFphotometryrefersto(effectively)fittingthesourcewith 4.1. Optimality aPSF. 8 Zackay, Ofek and Gal-Yam Our image subtraction and transient detection formu- which means that all the spatial frequencies of D(cid:98) have lae were derived using the lemma of Neyman & Pearson equal variance. Furthermore, since we assume that the (1933). This ensures that whenever our assumptions are images have white noise, their Fourier transform has correct our method is optimal. Our assumptions: the white noise. This means that the spatial frequencies of images are registered, dominated by uncorrelated Gaus- D(cid:98), as a linear combination of R(cid:98),N(cid:98), has un-correlated sian background noise, and that the PSFs, background, noise. Together, both properties mean that D(cid:98) has white variance and flux-based zero points are known. noise,whichmeansthatD hasalsowhitenoise. Inother words,thedifferenceimageisaproperimage(asdefined 4.2. The constant-in-time image T cancels in Zackay & Ofek 2015b). This property is violated by For perfectly registered images, both the optimal all the other methods for image subtraction. proper difference image (D) and transient detection We note that in the vicinity of bright stars, where the image (S) are free of subtraction residuals from the source noise variance is dominant, the proper subtrac- constant-in-time image. This is because the constant- tion image D exhibits correlated noise. Our simulations in-time image T algebraically vanishes. suggest that if the source variance is at least an order This is not the case in the subtraction methods sug- ofmagnitudehigherthanthebackgroundvariance, than gested by Alard & Lupton (1998) and Bramich (2008). correlated noise is detectable by eye in the vicinity of In these methods an optimum for the trade-off between such sources. However, as we stated before, using our magnifyingtheimagenoiseandminimizingtheconstant- method,thesourcenoiseiscontrollableviavariancecor- in-time residuals of T was explored. rections. 4.3. Numerical stability 4.6. D and P are sufficient for any measurement or D InspectingEquations12,13and14,itisapparentthat decision on the difference between the images ifthedenominatorisapproachingzero,thenthenumera- In statistics, a statistic is sufficient with respect to tor is approaching zero even faster. Therefore our image a statistical model and its associated unknown param- subtraction method is numerically stable for all combi- etersifnootherstatisticthatcanbecalculatedfromthe nations of PSFs for the new and reference images. same sample provides any additional information as to WenotethattheAlard&Lupton(1998)andBramich the value of the parameter. In Appendix E we provide (2008) methods are numerically unstable in the general a proof that D and P are sufficient for any measure- case,asthesemethodseffectivelyperformdeconvolution. D mentorhypothesistestingonthedifferencebetweenthe It is true that if the PSF of the reference image is nar- images. The key ingredients for this proof are that any rower, in all axes, than the PSF of the new image, then likelihood calculation for any generative model for the the Alard & Lupton (1998) family of methods are sta- differencebetweentheimagescanbecomputedbyusing ble. However, even in this case the solution found by only these quantities, and the use of the Fisher-Neyman these methods is sub-optimal (i.e., it does not maximize factorization theorem (Fisher 1922; Neyman 1935). We the S/N of the transients). We further demonstrate this notethatthereareinfinitenumberofsufficientstatistics point in §6. withrespecttotheimagesubtractionproblem(seesome examplesinZackay&Ofek2015binthecontextofcoad- 4.4. Locality dition). Here we prefer the proper subtraction image D Animportantpropertyofourimage-subtractionstatis- (rather than e.g., S) due to its useful properties. tic is its locality with respect to the input data. The The sufficiency property has important practical con- formulae for the image subtraction statistic are stated sequences. It means that, in the background-noise dom- in Fourier domain for simplicity and clarity. But, even inated limit, D and P contain all the information one D thoughoperationsdoneintheFourierdomainareglobal needs for any further measurement or hypothesis testing in nature, when calculating the final kernels that actu- relatedtothedifferencebetweentheimages. Thereisno allymultiplyR(cid:98)andN(cid:98),weseethattheirrepresentationin need for other types of difference images for other appli- thespatialdomainislocal,withpowervanishingquickly cations. Examples for practical applications include the awayfromtheorigin(i.e.,thePSFsareapproachingzero identificationandremovalofparticlehitsonthedetector atlargedistancefromtheirorigin). Thisisbecausethese (see§3.7); optimalsearchforpropermotion, astrometric operationsrepresentconvolutionwithafinite-sizekernel. shifts (§3.4) and asteroid streaks. Therefore, the proper subtraction statistic could be cal- culated independently for every arbitrarily small image 4.7. Symmetry between the new image and the reference patch (up to few times the PSF size), allowing the PSF image tovarysmoothlyacrosstheimage. Inaddition, localar- The problem of image subtraction is symmetric to the tifactssuchasbadpixels, particlehitsorsaturatedstars exchangeofthereferenceandthenewimage(uptonega- will affect only their close vicinity due to the locality of tionofthefluxofthetransient). Therefore,itisnotsur- the kernels used. prising that the optimal image subtraction statistics (D or S) are symmetric to the exchange of R and N (up to 4.5. The proper image subtraction D has white noise a minus sign). This property is violated by the solutions In the expression for D(cid:98) (Eq. 13), in the background- proposed by Phillips & Davis (1995), Alard & Lupton noise dominated limit, the variance of the numerator is (1998), and Bramich (2008). We note that the Gal-Yam equal to the square of the denominator, i.e: et al. (2008) method preserves this symmetry. Interest- ingly, the Gal-Yam et al. (2008) method is identical to V[FrP(cid:99)rN(cid:98) −FnP(cid:99)nR(cid:98)]=σr2Fn2|P(cid:99)n|2+σn2Fr2|P(cid:99)r|2, (46) the numerator of the proper image subtraction statistic Proper Image Subtraction 9 PSFs. This is relevant in rare cases of images that con- tain no point sources, for example, only galaxies. How- ever,inordertooptimallyfindtransientsintheimageall the methods requires the PSFs (see §4.8). In any case, in most observational situations the PSF is measurable frompointsourcesintheimageandthereforethisshould not be considered as a drawback. 4.11. Registration and color-refraction errors Image subtraction relies on many steps taken prior to the differencing process. Any noise introduced by the pre-processing steps will be propagated into the fi- nal subtraction image. Examples for such problems in- clude: registration errors, color-refraction systematic er- rors, and small-scale flat-fielding errors. Herewesuggesttwotypesoftreatmentsforsuchnoise: (1) It is straightforward to introduce these extra sources of noise into the variance image of S and use it to calcu- Figure 1. Pn (left column), Pr (middle column) and the corre- sponding PD (right column) for three cases. The first row is for lateScorr (see§6.1forexamples). Thiscorrectionissub- the case of symmetric Gaussian PSFs with sigma-width of 2 and optimal, but it is resilient to pre-processing errors. (2) 3pix for the new and reference, respectively. The second row is Anaccuratetreatmentoftheproblemistofitanyastro- forthecaseofa-symmetricGaussianPSFswithsigma-widthof2 metric shift and flux variation for each detected artifact by4pixand4by2pixforthenewandreference,respectively. In the third row Pn and Pr are simulated speckle images (using the in the difference image D (see §3.4). Albeit this is com- toolsinOfek2014). InthespecklesimulationswesetDtel/r0=20, putationallyexpensive,thiskindofsolutionisverycom- whereDtel isthetelescopediameterandr0 istheFriedlength. mon in astronomy (e.g., DAOPHOT, Stetson 1987; DOPHOT, (D). Schechter et al. 1993). In the future, it is possible that the use of these steps 4.8. The limit of noiseless reference image may enable us to remove completely the need for any post-subtraction transient identification using machine In the limit of σ →0 Equation 12 becomes r learningorhumanclassification. Wenotethatsuccessful lim S(cid:98)= FnFr2P(cid:99)n|P(cid:99)r|2N(cid:98) −FrFn2P(cid:99)r|P(cid:99)n|2R(cid:98) (47) iimngpolefmaellnttahteiosnouorfctehseosfenidoeisaes.requiresgoodunderstand- σr→0 σn2Fr2|P(cid:99)r|2 4.12. Free parameters = FnP(cid:99)n(cid:0)N(cid:98) − FnP(cid:99)nR(cid:98)(cid:17). (48) Inprinciple,ourmethoddoesnothaveanyfreeparam- σn2 FrP(cid:99)r eters that the user needs to set. We note that the Alard & Lupton (1998), Bramich (2008), and Yuan & Akerlof ThetermP(cid:99)n/P(cid:99)r canbeidentifiedastheconvolutionker- (2008)methodsdohaveinternaldegreesoffreedomthat nelsolvedforbythemethodsofPhillips&Davis(1995), the user needs to define and that may influence the fi- Alard & Lupton (1998) and Bramich (2008). Therefore, nal outcome. For example, the Bramich (2008) method in this limit, S converges to the Alard & Lupton (1998) may be sensitive to the kernel size, while the Alard & familyofmethodsfollowedbyfilteringeachoftheimages Lupton (1998) method depends on the basis functions with the PSF of the new image. one chooses to represent the convolution kernel (see e.g., This simple analysis demonstrates that the Alard & Becker et al. 2012 and Bramich et al. 2015). Lupton (1998) family of methods, if followed by the cor- rectmatchedfiltering, isaspecialcaseofoursolutionS. 4.13. Computational complexity Furthermore, Equation 48 provides the prescription for In terms of computational complexity, our subtraction the correct matched filter (only) in the limit of σr →0. method is fast, as the most demanding operation in our image subtraction method is the FFT operation (or al- 4.9. The PSF of the difference image ternatively convolution with a small kernel). Tests indi- The PSF, P , of the proper subtraction image is a catethatouralgorithmisatleastanorderofmagnitude D combinationofP andP . InFigure1wepresentP ,P faster than the inversion algorithms by Alard & Lupton n r n r and the corresponding P for three cases, of symmetric (1998) and Bramich (2008) as they are essentially solv- D Gaussians, a-symmetric Gaussians, and speckle images. ing a linear least square problem with a large number of equations and tens to hundreds of unknowns. 4.10. Knowledge of the PSFs 5. SUMMARYOFALGORITHM Anapparentdrawbackofourmethodisthatoneneeds We recommend to perform the subtraction on small to know the PSFs of the images, while in the Alard image patches in order to minimize residual astrometric & Lupton (1998) family of methods one simply solves shifts, inhomogeneous transparency and background. In for the convolution kernel P(cid:99)n/P(cid:99)r without measuring the addition, it allows to use position-dependent PSFs. The PSFs. imagepatchesshouldbeoverlappingbyatleasttwoPSF However, one can write the expression for D with lengths, in each dimension, in order to avoid edge effects P(cid:99)n/P(cid:99)r, allowingtoincorporaterelativeknowledgeofthe of the convolution process. 10 Zackay, Ofek and Gal-Yam A step-by-step outline of our algorithm is as follows: 12. Calculate k (Eq. 29). n Input arguments: 13. SetV((cid:15)n)=Nb+rn2 andcalculateV(SN)(Eq.26). N -backgroundsubtractednewimage(registeredtoR). R - background subtracted reference image. 14. Set V((cid:15)r)=Rb+rr2 and calculate V(SR) (Eq. 27). N - new image including background in electron units. If R is composed of multiple images, it is better b Rb - reference image including background in electron to sum up the V(SRj) of the individual reference units. images (see Appendix C and Eq. C7). P - PSF of new image normalized to have unit sum. n Pr - PSF of reference image normalized to have unit 15. Calculate Vast(SN) (Eqs. 30 and 31). sum. 16. Calculate V (S ) (Eqs. 32 and 33). σ - std of the background of the new image. ast R n σ - std of the background of the reference image. r 17. Calculate S (Eq. 25). As a sanity check, the corr r - read noise of new image in electrons. n (robust) std of S should be ≈1. corr r - read noise of reference image in electrons. r σx - rms (in pixels) of the astrometric registration 18. Search for local maxima in Scorr - the peak value solution in the X-axis. This is either a scalar or a corresponds to the significance of the transient in matrix. units of sigmas. σ - rms (in pixels) of the astrometric registration y solution in the Y-axis. This is either a scalar or a 19. Asanalternativetosteps15and16,wecansearch matrix. all locations in D that correspond to statistically significant sources in S (without astrometric corr Output: contributions) for moving point sources using PS D - The proper difference image. (Eq. 34), measure their flux and astrometric vari- P - The PSF of the proper difference image. ability and subtract them. D S - The matched filter difference image corrected for corr 20. Select remaining sources with significance larger source noise and astrometric noise. than some threshold, determined from the desired P - The PSF of a delta function in N as it appears in Dn false alarm probability. D. P -ThePSFofadeltafunctioninRasitappearsinD. Dr 21. Calculatethefluxofthetransientcandidatesusing Equations 41, 42, and 43. Algorithm: 6. TESTS 1. Optionallyconstructareferenceimage(R;Eq.22), There are several important challenges in testing any itsPSF(P ;Eq.23)andflux(F ;Eq.24)usingthe image differencing algorithm. Image subtraction in gen- r r Zackay & Ofek (2015b) proper coaddition method. eralisaffectedbymanyfactors. Therefore,itisdesirable to separate between external problems (e.g., non-perfect 2. Solve Equation 39 for the best-fit value of β and registration) and issues related to the subtraction itself optionallyγ, ∆x, and∆y (needtouseEqs.37and (e.g., numerical stability). Therefore, we are using both 38). Since this Equation is non-linear in β use it- simulations and real data to test our image differencing erations. Set β = 1 in the first iteration, update algorithm. the value of β and continue until convergence. Use It is worthwhile to compare the new algorithm with robust fitting16. existing methods. However, such a comparison is prob- lematic, as other methods do not specify the matched 3. Ifapplicablecalculateγ(Equation40)andsubtract filter for source detection. Furthermore, some of these γ from N. methods depend on the selection of basis functions and 4. If applicable, shift P by ∆x and ∆y. kernel size. In addition, there are several ways to solve n a system of linear equations (e.g., SVD) and these may 5. Set Fr =1 and Fn =β. influencethefinaloutcome. Therefore,hereourcompar- ison with other methods is limited. 6. Calculate D(cid:98) (Eq. 13). In§6.1wepresenttestsbasedonsimulateddata,while in §6.2 we discuss real images. The code we use is avail- 7. Calculate P(cid:99)D (Eq. 14). able as part of the Astronomy and Astrophysics package for MATLAB (Ofek 2014), described in §7. 8. Calculate S(cid:98)=P(cid:99)DD(cid:98). 6.1. Simulations 9. Calculate P(cid:100)Dn (Eqs. 18 and E13). An important feature of our algorithm is its numerical stability. The best way to test this is on simulations, as 10. Calculate P(cid:100)Dr (Eqs. 19 and E17). the input is fully controlled. 11. Calculate k (Eq. 28). We simulated images of 512 by 512 pixels size, with r background level of 300electrons, with Poisson noise. In 16 Robust fitting is less sensitive to outliers. An example for a each image we simulated 100 stars with integrated flux robustfitteristherobustfit.mfunctioninMATLAB. taken from a flat distribution between 0 to 105electrons