XXV European Cosmic Ray Symposium, Turin, Sept. 4-9 2016 1 Development of a Machine Learning Based Analysis Chain for the Measurement of Atmospheric Muon Spectra with IceCube Tomasz Fuchs for the IceCube Collaboration∗ Technical University Dortmund, Experimental Physics V, D-44221 Dortmund, Germany High-energymuonsfromairshowereventsdetectedinIceCubeareselectedusingstateoftheart machine learning algorithms. Attributes to distinguish a HE-muon event from the background of low-energy muon bundles are selected using the mRMR algorithm and the events are classified by a random forest model. In a subsequent analysis step the obtained sample is used to reconstruct the atmospheric muon energy spectrum, using the unfolding software TRUEE. The reconstructed spectrumcoversanenergyrangefrom104GeVto106GeV.Thegeneralanalysisschemeispresented, 7 including results using the first year of data taken with IceCube in its complete configuration with 1 86 instrumented strings. 0 2 n I. INTRODUCTION Eµ,HE describes the energy of the HE-muon and a Ebundle,tot. the total bundle energy including the J HE-muon. Since the muons within the bundle cannot IceCube is a cubic kilometer detector array located 5 be resolved spatially, high-energy muons can only be atthegeographicSouthPole. Its5160DigitalOptical 1 identified indirectly from the relation between catas- Modules(DOMs)areusedtodetectsecondarymuons produced either in neutrino interactions with ice or trophic and continuous energy losses, and separation ] M bedrock, or in cosmic ray air showers. The combi- from background is challenging. Our analysis chain has the following three parts: nation of the conventional (i.e. produced in pion and I a) the selection of event properties (hereafter referred . kaon decays) atmospheric and astropysical neutrino h to as attributes) to be used in the analysis, b) the flux dominates over the expected prompt component p classification of events into signal and background us- - which is produced mainly by charmed hadron decays o in the atmosphere. Therefore an accurate measure- ing these attributes, and c) the reconstruction of the r spectrum of the HE-muons at the surface. t ment of the prompt flux magnitude is a difficult task s foralarge-volumeneutrinodetector. Sincetheenergy a [ spectrum of atmospheric muons has no astrophysical component, this flux can be studied to determine the II. ATTRIBUTE SELECTION 1 magnitude of the prompt flux which is produced in v leptonic decays of charmed hadrons and unflavored Choosing the right attributes is a crucial step for 7 6 mesons. every analysis. For this proceeding every value which 0 In general high-energy muons from air showers ismeasuredorreconstructedforeverysingledetected 4 are accompanied by a bundle of low-energy muons. orsimulatedeventisreferredtoasanattribute. From 0 Therefore, the detection of HE-muons within a muon the pool of all available attributes we select algorith- . 1 bundle is a challenging task as the IceCube detec- mically those that agree between data and simulation 0 tor lacks the spatial resolution to resolve individual using a comparison of the integrated distributions. 7 muons within a muon bundle. To analyze the atmo- This is also a criterion in the attribute selection since 1 sphericmuonswedevelopedadataminingbasedanal- only well simulated attributes can be used to draw : v ysis scheme that selects algorithmically from a large conclusions for measured events. For following steps i pool of reconstructed event properties the most pow- it is common to manually select attributes based on X erful ones for distinguishing signal and background. experienceandexpertknowledge. Inthisanalysisthe r a A similar approach was used to analyze atmospheric large dimensionality of the set of attributes renders neutrinos[1] in IceCube. this procedure impossible. The developed analysis chain was used to recon- An algorithmic approach like the mRMR struct the spectrum of high-energy muons reaching algorithm[2] is necessary. In the mRMR algo- the IceCube in-ice detector. For this analysis a HE- rithm the relevance D and the redundancy R of a muon is defined as a muon which fulfills the following specific set of attributes is analyzed. In a specific condition attributesettherelevanceD describesthecorrelation of each attribute to the signal and background events E 0.5E . (1) whereas the redundancy R quantifies the correlation µ,HE bundle,tot. ≥ between the attributes in this set. In the mRMR algorithm a maximization of the equation ∗http://icecube.wisc.edu Φ(D,R)=D R (2) − eConf TBA 2 XXV European Cosmic Ray Symposium, Turin, Sept. 4-9 2016 is performed. The attribute set is built up iteratively To rule out a possible mismatch between data and byaddingasingleattributewhichmaximizesΦ(D,R) simulation the score distribution on simulated events in equation 2 in each step. The advantage of this is compared to data events. In Figure 2 an agree- procedure is the possibility to determine the best at- ment within 20% between simulated and analyzed tributes which are able to classify the data for any data events is seen. Since the data used to train the size of the attribute set. In this analysis the final at- model is limited there is an uncertainty for the score tribute set uses 30 attributes which proved to be a distributions as well as the efficiency and purity. The good tradeoff between execution time and separation uncertainty for the score, efficiency and purity are es- quality. timated using a 5-fold cross validation. In a cross validation the training set is split into n parts. The model is trained on n 1 parts and applied on the III. EVENT CLASSIFICATION left-out part. This pro−cedure is repeated until every part has been used as a test set. The variance of per- With the selected set of attributes a classification formance values of the multiple classifications is used of events into signal and background has to be done. to estimate the uncertainty. This can be achieved by splitting the event sample at a certain value of a certain attribute (a so called ”straightcut”),butalgorithmsfromcomputerscience like deep neural networks[4], boosted decision trees[5] or random forests[6] are increasingly used and show comparable or superior performance. Here the ran- dom forest algorithm is used to perform the selection ofHE-muons. Arandomforestisanensembleofsingle decisiontreeswhichselectrandomattributesatevery knot and is proven to be more robust against over- training. Inthecaseofatwo-classclassificationprob- lem the random forest determines a score which indi- cates the most likely class affiliation. Here the score isusedtodistinguishbetweensignal(HE-muons)and background events (low-energy muon bundles). Be- cause the random forest builds multiple random deci- sion trees, its score can simply be calculated by the fraction of single trees which classified an event to be FIG. 1: Random Forest score for the training and the signal T over the total number of built trees T. + test set for the separation of HE-muons and events S =T /T (3) without high-energy muons. Additionally the chosen + score limit of 0.9 is shown. To avoid a misleading classification result by an over- trainedmodelthedatatobuildthemodelissplitinto atrainingandatestsetwithrandomlychosenevents. The results of the model trained on the training set applied to the test set can be seen in Figure 1. Both classes can be distinguished in the score distribution, soitcanbeusedtoseparatetheHE-muonsfromback- ground events using a straight cut in the score value. Toevaluatetheoptimalcutvalues,differentperfor- manceindicatorsareused. Themostcommonindica- torsaretheefficiencyandthepurityofapotentialset of events. The efficiency describes the fraction of sig- nal events above a specific score value. The purity is theratioofsignaleventstothetotalnumberofevents above a specific score value. The requirements on the purityandefficiencyareproblemdependentso thata general approach is not available. In this analysis a FIG. 2: Comparison of the random forest score score cut of S 0.9 is used which results in an effi- between simulation and data observed with IceCube ciencyof(40.8 ≥0.6)%andapurityof(93.1 0.4)%. (10% of available statistics for IC86-I). Also the cho- ± ± sen score limit of 0.9 and the deviation between the The score cut was chosen because for higher cut val- simulated events and the measured data is shown. ues the efficiency of the resulting sample drops signif- icantly. eConf TBA XXV European Cosmic Ray Symposium, Turin, Sept. 4-9 2016 3 6.5 method is based on a Fredholm integral of the second kind which can be discretized to 102 6.0 (cid:126)g =Af(cid:126)+(cid:126)b. (4) 101 V) The sought after distribution f(cid:126) is folded with a re- e5.5 G 100 sponse matrix A. The response matrix A describes / µ the expected distribution of a specific attribute (cid:126)g for E g(105.0 10−1 each bin in surface muon energy. A background con- o tribution(cid:126)b has to be considered in addition to obtain l 10 2 theexpecteddistributionoff(cid:126). ThematrixAisdeter- 4.5 − minedfromsimulatedeventsandthereconstructionof f(cid:126)is done using a regularized[7] likelihood fit with the 10 3 − software TRUEE[8]. In the case of HE-muons two at- 4.0 0.5 0.0 0.5 1.0 1.5 2.0 2.5 tributes are used for the reconstruction of the surface − log10(dE/dxSplineMPEBINS/(GeVm−1)) energy. Thissurfaceenergyofthemuonisdetermined by the reconstructed energy losses of the muon in the FIG. 3: Correlation between the surface muon energy and detectorandtheestimatedenergylossesoccuringdur- the energy estimator. For each energy estimator bin the ing the propagation from the surface to the detector. mean and the standard deviation of the surface energy are To account for their energy loss prior to reaching Ice- shown. Cube, the distance from the surface to the detector 6.0 103 boundary is used since no measurement can be per- formed outside the detector. The correlation between 102 the surface energy E and the reconstructed energy µ 5.5 loss dE/dx and the distance from the sur- V) 101 face DepthSpolfintehMPeEBeIvNSents are shown in the Figures 3 e and 4. G /5.0 100 Using these attributes a surface energy spectrum µ E of the HE-muons can be determined and is shown in ( g10 10−1 Figure 5. To enhance the visibility of spectral fea- o l tures of the reconstructed flux the result is multiplied 4.5 10 2 by (E /GeV)3. The blue lines in Figure 5 are theo- − µ retic predictions for the flux and are split into a con- 10 3 ventional part (using SIBYLL-2.1[10] as the hadronic − 4.0 interaction model and GaisserH3a[11] as the cosmic- 2 4 6 8 10 12 14 rayspectrum)andapromptcontributionfromcharm Depth/km (ERS[12]) and unflavored (Unflavored[13]) particles. A previous analysis without a machine learning ap- FIG. 4: Correlation between the surface muon energy and proachpublishedin[9]isshownwithitsuncertainties thepropagateddistanceintheice. Foreachdistancebinthe asagrayband. Theresultsofthisanalysisareplotted mean and the standard deviation of the surface energy are shown. as black circles. The uncertainty of this analysis has two contribu- tions. A smaller one from the limited statistics of ob- IV. UNFOLDING servedeventsandalargeronefromthelimitedstatis- ticsofsimulatedsignalevents. Toestimatetheuncer- The muon energy at the surface cannot be mea- taintyfromthelimitedsimulationeventsthespectrum sureddirectly. Itsdistributionhastobereconstructed wasreconstructedusingmultipleresampledsubsetsof from observables that can be measured at the detec- the total set. The variance of the spectra was then tor. A common method is the forward folding which considered in the estimation of the uncertainty. This fits the expected event properties to the data vary- yieldsalargeuncertainty,especiallyforthehigheren- ing parameters of a model of the muon energy spec- ergies (Eµ > 105 GeV). This is due to the fact that trum at the surface. This method has the disadvan- events with more energy are far less often simulated tagethattheresultingspectralshapeislimitedtothe due to the assumed E−2 cosmic ray spectrum in the chosen parametrization. An alternative approach to simulation. this is the unfolding method which estimates the sur- facemuonspectrumdirectlybyinvertingtheresponse functionswhichdescribetheexpectedeventproperties foreachvalueofthesurfaceenergyofthemuon. This eConf TBA 4 XXV European Cosmic Ray Symposium, Turin, Sept. 4-9 2016 -1 10 1 IceCube Preliminary − ) V e G r s s 2 m c 10-2 ( / 3 ) V e G / SIBYLL-2.1 (GaisserH3a) Eµ ERS (× Unflavored µ -3 SIBYLL-2.1 + ERS + Unflavored Φ 10 Astropart. Phys. 78 (2016) 1-27 IC86-I+σstat.+σMC 5 6 10 10 E / GeV µ FIG. 5: Unfolded muon spectrum multiplied with E3 using the IC86-I data. The reconstructed muon spectrum from AP.78 is shown in gray. In blue the theoretic prediction for a conventional part (SIBYLL-2.1[10]) with the assumed GaisserH3a[11]cosmicrayfluxmodelinbracketsareshownandathecontributionbyapromptcomponentfromcharm (ERS[12]) and unflavored (Unflavored[13]) particles. V. CONCLUSION AND FUTURE ofsystematiceffects. Thestatisticalsignificanceisnot PERSPECTIVES sufficienttodeterminewhetherapromptcontribution is present on the predicted level in the resulting spec- The described machine learning based analysis trum. chainprovedthatitcanbeusedtoseperateHE-muons Sincetheuncertaintyisduetothelackofsimulated frombackgroundeventsandunfoldtheirenergyspec- events, improvements of the final result are expected trum. The reconstructed spectrum of HE-muons is with growing numbers of simulations. The available compatible with theoretical predictions and with pre- simulation statistics for the years 2012 and above is 7 vious results from the IceCube collaboration. This times larger than in this analysis. This will decrease spectrum does not include systematic effects com- the statistical uncertainty of the simulation and ex- paredtothepublishedresultsshowninthegrayband. tend the results of this analysis to higher energies. Thereforeitisexpectedthattheuncertaintyofthere- constructed spectrum will increase with the inclusion [1] M.G.Aartsenet al.,TheEuropeanPhysicalJournal [6] L. Breiman, Machine Learning 45 (2001) C 75 (2015) [7] A.N.Tikhonov,Dokl.Akad.NaukSSSR151(1963) [2] C.Dingetal.,IEEETransactionsonPatternAnalysis [8] N. Milke et al., NIM A 697 (2013) and Machine Intelligence 27 (2016) [9] M.G. Aartsen et al., Astropart. Phys. 78 (2016) [3] L. Breiman, Machine Learning 45 (2001) [10] A. Fedynitch et al., Phys. Rev. D 86 (2012) [4] Y. Bengio and Samy Bengio, Advances in Neural In- [11] T.K. Gaisser, Astropart. Phys. 35 (2012) formation Processing Systems 12 (200) [12] R. Enberg et al., Phys. Rev. D 78 (2008) [5] Y.FreudandR.E.Schapire,MachineLearning: Pro- [13] A. Fedynitch et al., PoS(ICRC2015) 558 (2015). ceedings of the Thirteenth International Conference (1996) eConf TBA