NuclearPhysicsB Proceedings Supplement NuclearPhysicsBProceedingsSupplement00(2015)1–5 Data processing and storage in the Daya Bay Reactor Antineutrino Experiment MiaoHe InstituteofHighEnergyPhysics,Beijing 5 OnbehalfoftheDayaBaycollaboration 1 0 2 n a J Abstract 8 2 The Daya Bay Reactor Antineutrino Experiment reported the first observation of the non-zero neutrino mixing angleθ13 usingthefirst55daysofdata. Ithasalsoprovidedthemostprecisemeasurementofθ13 withtheextended ] datato621days. DayaBaywillkeeprunningforanother3yearsorso. Thereisabout100TBrawdataproduced t e peryear,aswellasseveralcopiesofreconstructiondatawithsimilarvolumetotherawdataforeachcopy. Theraw d dataistransferredtoDayaBayonsiteandtwooffsiteclusters: IHEPinBeijingandLBNLinCalifornia,withashort - s latency. There is quasi-real-time data processing at both onsite and offsite clusters, for the purpose of data quality n monitoring, detector calibration and preliminary data analyses. The physics data production took place a couple of i . timesperyearaccordingtothephysicsanalysisplan. Thispaperwillintroducethedatamovementandstorage,data s c processingandmonitoring,andtheautomationofthecalibration. i s Keywords: DayaBay,reactor,neutrino,dataprocessing y h p [ 1. Introduction periments,θ canbeextractedfromthesurvivalprob- 13 abilityoftheelectronantineutrinoatadistanceof1∼2 1 Neutrinos are elementary particles in the Standard v kmfromthereactors Model. There are three flavors of neutrinos, known 9 6 as νe, νµ and ντ. Neutrinos exist in the nuclear fusion P(νe →νe)≈1−sin22θ13sin2(1.27∆m231L/E) (1) 9 in the sun, β-decays of radioactivities inside the Earth, where E is the ν energy in MeV and L is the distance 6 the Big Bang remnant, the supernova explosion, or in- e 0 teractions between cosmic rays and the atmosphere of inmetersbetweentheνe sourceandthedetector(base- . line). 1 the Earth. Besides the natural sources, neutrinos can 0 be also produced artificially in reactors and accelera- 5 tors. The neutrino flavor states are superpositions of 2. TheDayaBayexperiment 1 three mass eigenstates (ν , ν and ν ). One flavor can : 1 2 3 v changetoanotherbecauseofthequantuminterference The Daya Bay experiment was designed to provide Xi of the mass eigenstates during the traveling. This phe- themostprecisemeasurementofθ13amongexistingand r nomenon is known as neutrino oscillation or neutrino nearfutureexperiments,withasensitivitytosin22θ13 < a mixing. The amplitude of the oscillation is connected 0.01 at the 90% CL [1]. The experiment is located to the mixing angles θ , θ and θ . The oscillation nearDayaBay,LingAoandLingAo-IInuclearpower 12 23 13 frequenciesaredeterminedbythedifferenceofsquared plantsinsouthernChina,45kmfromShenzhencity.As neutrinomasses,∆m2 =m2−m2. Forreactor-basedex- showninFig.1,therearesixfunctionallyidenticalreac- ij i j torcores, groupedintothreepairs. Threeunderground experimentalhalls(EHs)areconnectedwithhorizontal Emailaddress:[email protected](MiaoHe) tunnels. Twoantineutrinodetectors(ADs)werelocated /NuclearPhysicsBProceedingsSupplement00(2015)1–5 2 in the Daya Bay near hall (EH1), two in the Ling Ao wascontinuouslyupdatedwithincreasedstatistics. The and Ling Ao-II near hall (EH2), and four near the os- latestresultofDayaBaywasbasedon621daysofdata cillation maximum in the far hall (EH3). Each AD is withaprecisionof5.6%[5]. filled with 20 t of gadolinium-doped liquid scintillator (Gd-LS) as a target, surrounded by 20 t un-doped liq- 3. DataprocessingandstorageinDayaBay uid scintillator (LS), for detecting gammas that escape fromthetarget. Theν interactswithprotonviathein- e Fig.2isaglobalpictureofDayaBaydataprocessing. verse β-decay (IBD) reaction in Gd-LS, and releases a Raw data is firstly transferred to the onsite farm then e+ andaneutron. Thepromptsignalfrome+ ionization to Insititute of High Energy Physics (IHEP) in Beijing andthedelayedsignalfromneutroncaptureonGdwith andLawrenceBerkeleyNationalLaboratory(LBNL)in ∼30 µs latency provides a time coincidence signature. Californiaascentralstorageandprocessing. Themeta- Themuonvetosystemineachhallconsistsofaninner dataintheonlinedatabaseisalsotransferredtotheof- watershield,anouterwatershieldandaresistiveplate flinedatabase. APerformanceQualityMonitoringsys- chamber (RPC). The detailed introduction to the Daya tem (PQM) [6] is running onsite using fast reconstruc- BayexperimentcanbefoundinRef.[2]. tionalgorithmstomonitorthephysicsperformancewith alatencyofaround40minutes.Akeep-updataprocess- ingtakesplaceassoonasthedatareachIHEPorLBNL, using the full reconstruction and the latest calibration constants. Thegenerateddetectormonitoringplotsare published through an Offline Data Monitoring system (ODM)withalatencyofaround3hours. Theextracted dataqualityinformationisfilledinadedicateddatabase forthelong-termmonitoring. Figure1:LayoutoftheDayaBayexperiment. DayaBaystarteddatatakingintheendof2011with six ADs. The last two ADs were installed in summer 2012. Multiple partitions in the data acquisition soft- Figure2:OverviewofthedataprocessinginDayaBay. ware (DAQ) [3] allow data taking in three experiment halls simultaneously and the output of multiply data 3.1. Datamovementandstorage streams. Atypicalphysicsrunineachhalllastsfor48 hours, withapedestalrunandanelectronicsdiagnosis ThedatamovementiscontrolledbyasetofJavaap- run in between. The typical trigger rates are 1.3 kHz, plications, and the primary application is call SPADE. 1.0kHz,0.6kHzinEH1,EH2,EH3,respectively. The ArawdatafileiscopiedtoanoffinefileserveratDaya total number of raw data files each day is about 320, Bay onsite automatically once it’s closed and recorded approximately 1 GB volume per file. The ratio of the in the online database by DAQ. When the movement datatakingtimein2013islargerthan97%,andofthe is done, the file status in both of the online and offline physics data taking time is larger than 95%. Based on databaseischangedtoTRANSFERREDbySPADEso thefirst55daysofdata,DayaBayreportedthefirstob- that this file can be deleted from the online disk. Af- servationofthenon-zeroneutrinomixingangleθ with terthat,thefileistransferredtotwocomputingcenters 13 5.2 standard deviations [4]. The precision of sin22θ IHEPandLBNLsequentiallywithamaximumlatency 13 /NuclearPhysicsBProceedingsSupplement00(2015)1–5 3 of 20 minutes. SPADE also creates a xml file which slavedatabases,asshowninFig.3. Duringdatataking, containsdescriptionsofrawdataforthefilecataloging. theinformationsuchasthedescriptionofrawdataand Multiplytoolsaredeployedtomonitorthenetworktraf- thedetectorstatusisextractedfromonlinedatabasesto ficatDayaBayonsiteandtomonitortherateoftrans- theonsiteofflinedatabaseautomaticallywithascraper. ferredrawdatafilenumberanddatavolumefromDaya Then the onsite offline database is synchronized to the BaytoIHEP,andfromIHEPtoLBNL. central master database at IHEP. The central database RawdataisstoredatDayaBayonsiteforonemonth alsocontainsthecalculatedreactorneutrinofluxbased andarchivedatIHEPandLBNL,withonecopyondisk onthereactorinformationprovidedbythepowercom- andtwocopiesontapeforeachofthem.Theinfrastruc- pany. There are multiple slaves as replications of the tureofthesetwoclustersaredifferentbutthecapability central database located at different institutes. Offline ofcomputingandstoragearesimilar. Eachclusterhas calibration constants produced by calibration experts near1PBdiskspaceandaround800CPUcoresinclud- are firstly filled in a temporary table of a local slave, ing some shared resources. Additional computing re- then dumped into a formative text file after validation. sources have been planned to accommodate increasing The text file is archived in the subversion system. The data. Accumulatedrawdatais320TBbythemiddleof database manager is responsible to fill the calibration 2014. constants into the central database through the text file witha revisionnumberspecifiedby thecalibrationex- 3.2. Offlinesoftware pert. This standard operating procedure makes sure theupdatingofcalibrationconstantshighlycontrolled, The Gaudi [7] framework based Daya Bay offline carefullyrecordedandeasilyreproducible. software (known as NuWa) provides the full function- alityrequiredbysimulation,reconstructionandphysics analysis. NuWaemploysGaudi’seventdataserviceas the data manager. Raw data, as well as other offline dataobjects, canbeaccessedfromtheTransientEvent Store (TES). The prompt-delayed coincidence analysis requireslooking-backintime,whichisfulfilledviaspe- cific implemented Archive Event Store (AES). All the dataobjectsinbothTESandAEScanbewrittenintoor read back from ROOT [8] files through various Gaudi converters.AnalternativeLightweightAnalysisFrame- work (LAF) was designed and implemented by Daya Bay to improve the analysis efficiency. LAF is com- patiblewithNuWadataobjectswithhigherI/Operfor- mance by the simpler data conversion, the implemen- Figure 3: Standard operating procedures of the Daya Bay offline tation of lazy loading, and the flexible cycling mecha- database. nism. LAFallowstoaccesseventsbothbackwardsand forwards through an data buffer, which also serves to exchange data among multiple analysis modules. The 3.4. Calibrationandreconstruction NuWaandLAFpackagesareavailabletocollaborators Various of sources are used to calibrate the detector usingSubversion(SVN)codemanagementsystem[9]. response. Specific calibrations runs are taken weekly The NuWa auto-build system was implemented using using LED and radioactive sources. Other calibration the Bitten [10] plug-in for trac [11], which is an en- samples such as PMT dark noises and neutrons pro- hancedwikiandissuetrackingsystemforsoftwarede- duced by cosmic rays are selected in normal physics velopment projects. Multiple servers at Daya Bay on- runs. There are two categories of the calibration. The site, IHEP, LBNL and other institutes work as the Bit- file-by-file track automatically accumulates calibration tenslaveandbuildNuWaautomaticallywhenthecode samples from physics runs and generates calibration isupdated. constants. The run-by-run track uses a script to search all raw data from calibration runs and submit jobs to 3.3. Database get calibration constants. Reconstruction algorithms The offline database consists of an onsite offline readcalibrationconstantsfromthelocalslavedatabase database,thecentralmasterdatabaseandmultiplelocal throughaDatabaseInterface(DBI).Atimestampcalled /NuclearPhysicsBProceedingsSupplement00(2015)1–5 4 rollbackdateissettochoosethelatestrecordsthatwere analysis,suchasthecomparisontoreferences,thedis- inserted into the offline database before it, for the pur- playofdetectorconfigurationandstatus,andinterfaces poseoftheversioncontrol. of online and offline databases. Two example plots on ODMareshowninFig.5. Thepromptanddelayedsig- 3.5. Onsitedataprocessingandmonitoring nalsfromtheinverseβ-decayareselectedbasedonAES The PQM system running at Daya Bay onsite pro- duringKUP.TheKUPjobshavethehighestpriorityin vides a quasi realtime data processing and monitoring the job system, which typically occupy 30 CPU cores. during data taking. There are 16 dedicated CPU cores ThelatencyofODMisdominatedbythereconstruction for PQM and 40 additional cores shared with users. A andanalysisalgorithms. Python control script has been developed and runs in backgroundmodeonadedicatedserver,whichqueries the onsite offline database at a fixed time interval (10 seconds). If a new record is found and the data status returnedbythequeryshowsthatthecorrespondingraw datafileisalreadyonthefileserver,ajobforprocessing the new file is submitted to the Portable Batch System (PBS). The PQM job uses a recently manually tagged NuWaversiontoreconstructdatawiththelatestcalibra- tionconstantsintheofflinedatabase,runsanalysisalgo- rithms,andfillstheresultsintouser-definedhistograms. All of the histograms are dumped into a ROOT file at the finalization step of the algorithms, which is then merged with the accumulated ROOT file for the same run. After that, a C++ histogram printer is executed, taking the merged ROOT file as input to print selected histograms into figures. The control script also checks ifthejobisfinishedat10-secondintervalsbydetecting anemptyTXTfilecreatedattheendofthejob. Forfin- ishedjobs,thecontrolscripttransfersthecorresponding figurestothePQMdiskforwebdisplay. Whentherun ends and all the corresponding raw data files are pro- cessed,thecontrolscriptsavestheaccumulatedROOT file on the PQM disk for permanent storage. The data flow of PQM in shown in Fig. 4. The reconstruction and analysis modules in PQM is configurable, and the default configuration takes around30 minutes for each job. Figure5:Monitoringoftheenergyspectraofprompt(up)anddelayed 3.6. Offsitedataprocessingandmonitoring (down)signalsfromtheinverseβ-decayonODM. As soon as a raw data file reaches IHEP or LBNL, an application called ingest in the data movement sys- Physicsproductionusesthevalidatedandfrozencal- tem submits a job using the full NuWa reconstruction ibrationconstantsandreconstructionalgorithms. Italso andthelatestcalibrationconstants,knownasthekeep- contains event tagging and filtering and provides both up (KUP) data processing. More analysis modules are thefulldatasampleanddifferentreducedsamplestoim- running allowing a high-level monitoring of data. The provetheefficiencyofdataanalysis. Thereconstructed outputhistogramscategorizedbydifferentdetectorsand data volume is about 1.2 times of raw data, and only by different channels are archived in ROOT files and the latest version is kept on disk, while previous ver- printedasfigures. Amultifunctionalwebpagehasbeen sions that used for the publication were archived on developed in the framework of Django, named as the tape. Physics production takes place one or two times Offline Data Monitor (ODM). The major function of per year according to the requirement of the physics ODM is to display the plots produced in KUP. It pro- analysis, with about one month of processing time for vides additional tools to help the data monitoring and each production. A special production strategy was /NuclearPhysicsBProceedingsSupplement00(2015)1–5 5 Figure4:ThedataflowofthePQMjob. usedforthefirsttwopublications[4,12]toquicklyget 4. Summary thephysicsresult. Theofflinesoftwarewasfixedinthe beginning. Thecalibrationconstantsweregeneratedaf- DayaBayisareactorantineutrinooscillationexper- ter the weekly calibration data taking, followed by the iment. It provided the first and most precision mea- extension of the physics data production for the past surement of the neutrino mixing angle θ13. The offline week. Thankstothisweeklyproductionstrategy,Daya systemprovidedquickandstabledatatransferring,data Bay finished the analysis and reported the first physics processing and data quality monitoring. Data Bay will resultonly20daysafterthefirststageofdatataking. continuetakingdatatilltheendof2017. Thecomput- ingresourcesandtheofflinesoftwarewillbeupgraded toaccommodatethedataprocessinganddatastorage. 3.7. Dataquality References During the keep-up data processing, information re- [1] X. Guo, et al., A precision measurement of the neutrino latedtothedataqualityisextractedandfilledinaded- mixing angle θ13 using reactor antineutrinos at Daya-Bay, icated database, which is automatically checked by an (2007)arXiv:hep-ex/0701029. applicationfortheevaluationofthedataquality. Onthe [2] F. An, et al., A side-by-side comparison of Daya Bay an- tineutrino detectors, Nucl.Instrum.Meth. A685 (2012) 78–97. otherhand,onlineshiftersrecordanyissueofdatainthe arXiv:1202.6181,doi:10.1016/j.nima.2012.05.030. database.Adataqualitymanagerisresponsibletomake [3] F.Li, X.-L.Ji, X.-N.Li, K.Zhu, DAQarchitecturedesignof atagtoeachrawdatafilebasedonbothauto-checkand DayaBayreactorneutrinoexperiment,IEEETrans.Nucl.Sci.58 manual-checkresults,andprovidesapreliminarygood (2011)1723–1727.doi:10.1109/TNS.2011.2158657. [4] F.An,etal.,Observationofelectron-antineutrinodisappearance filelist.Afterthephysicsdataproduction,thefinalgood atDayaBay,Phys.Rev.Lett.108(2012)171803.arXiv:1203. filelistwhichisusedforthephysicspublicationiscom- 1669,doi:10.1103/PhysRevLett.108.171803. piledbasedonthefeedbackfromanalyzers. Awebin- [5] C.Zhang,RecentResultsFromTheDayaBayExperiment. terface of the data quality database is implemented for URLhttp://neutrino2014.bu.edu/ [6] Y.-B.Liu,M.He,B.-J.Liu,M.Wang,Q.-M.Ma,etal.,Onsite thelong-termmonitoringofthedataquality. Anexam- dataprocessingandmonitoringfortheDayaBayExperiment, pleisFig.6whichshowsastripchartoftheantineutrino Chin.Phys.C38(2014)086001. arXiv:1406.2104,doi:10. candidaterateinoneoftheADinEH1. 1088/1674-1137/38/8/086001. [7] [link]. URLhttp://cern.ch/gaudi/ [8] [link]. URLhttp://root.cern.ch/drupal/ [9] [link]. URLhttp://subversion.apache.org/ [10] [link]. URLhttp://bitten.edgewall.org/ [11] [link]. URLhttp://trac.edgewall.org/ [12] F. An, et al., Improved Measurement of Electron Antineu- trino Disappearance at Daya Bay, Chin.Phys. C37 (2013) Figure6:Monitoringoftheantineutrinocandidaterate. 011001. arXiv:1210.6327,doi:10.1088/1674-1137/37/ 1/011001.