**Volume Title** ASP Conference Series, Vol. **Volume Number**
**Author** © **Copyright Year** Astronomical Society of the Pacific

The Challenge of Data Reduction for Multiple Instruments on Stratospheric Observatory For Infrared Astronomy (SOFIA)

Charcos-Llorens, M. V.¹, Krzaczek, R.², Shuping, R. Y.³, and Lin, L.¹

¹Universities Space Research Association, NASA Ames Research Center, Moffett Field, CA 94035, USA

²Chester F. Carlson Center for Imaging Science, RIT, 54 Lomb Memorial Drive, Rochester, NY 14623, USA

³Space Science Institute, 4750 Walnut Street, Boulder, Colorado 80301, USA

Abstract. SOFIA, the Stratospheric Observatory For Infrared Astronomy, presents a number of interesting challenges for the development of a data reduction environment which, at its initial phase, will have to incorporate pipelines from seven different instruments developed by organizations around the world. Therefore, the SOFIA data reduction software must run code which has been developed in a variety of dissimilar environments, e.g., IDL, Python, Java, C++. Moreover, we anticipate this diversity will only increase in future generations of instrumentation. We investigated three distinctly different situations for performing pipelined data reduction in SOFIA: (1) automated data reduction after data archival at the end of a mission, (2) re-pipelining of science data with updated calibrations or optimum parameters, and (3) interactive, user-driven local execution and analysis of data reduction by an investigator. These different modes would traditionally result in very different software implementations of the algorithms used by each instrument team, in effect tripling the amount of data reduction software that would need to be maintained by SOFIA.

We present here a unique approach for enfolding all the instrument-specific data reduction software in the observatory framework that satisfies the needs of all three reduction scenarios as well as the standard visualization tools. The SOFIA data reduction structure would host the different algorithms and techniques that the instrument teams develop in their own programming languages and operating systems. Ideally, duplication of software is minimized across the system because instrument teams can draw on software solutions and techniques previously delivered to SOFIA by other instruments. With this approach, we minimize the effort of analyzing and developing new software reduction pipelines for future-generation instruments. We also explore the potential benefits of this approach for the portability of the software to an ever-broadening science audience, as well as its ability to ease the use of distributed processing for data reduction pipelines.

1. Introduction

SOFIA is an airborne observatory designed primarily to carry out observations at infrared and sub-millimeter wavelengths that cannot be carried out from ground-based facilities. SOFIA will host a variety of instruments observing in wavelength ranges from 0.3 to 600 microns, and these will be upgraded over time. This will produce a large diversity of data types which will likely increase as new generations of instruments are operated.

The SOFIA Data Cycle System (DCS; see http://dcs.sofia.usra.edu) is a collection of tools and services that support both the General Investigator (GI) and the Science and Mission Operations (SMO) staff from observation and mission planning, through observation execution on board the aircraft, to post-flight data archiving and processing and distribution to the GI and the scientific community. The DCS will provide a uniform, extensible and supportable framework for all aspects of this data cycle.

The DCS will support data processing for both facility and Principal Investigator-class instruments, including archiving and pipelining of raw (Level 1), processed (Level 2), flux calibrated (Level 3), and higher level data products (e.g., mosaics and source catalogs). Data processing includes all steps required to obtain good quality flux calibrated data for spectroscopy, imaging, fast acquisition, polarimetry, etc. Processing each data type requires a sequence of unique or common algorithms with specific parameters to be tuned. The DCS will incorporate, improve and maintain these algorithms, which are provided by the instrument teams and developed in a variety of environments. In addition, these algorithms may require user interaction or fine tuning of input parameters in order to return good quality data.

2. Concepts and Associations

The DCS uses the Astronomical Observation Request (AOR) concept to collect all the information required to carry out an observation. AORs are produced by the GI and SMO staff during the observation planning stage and then passed to the science instrument (SI) during flight for execution. In addition, the AOR is the link between science and calibration data of the same observation type and defines the parameters necessary for post-processing; it therefore identifies the reduction pipeline and its parameters. For each Level 2 product, the Pipeline Pedigree (PP) records the pipeline that generated the data, the parameters, the processing date and the data involved in the process. The AOR and PP concepts have been implemented and are operative in the DCS. A similar concept will be necessary to track calibration activities: the DCS will include a Flux Calibration Parameter (FCP) which will support the calibration process in order to document it and reproduce the same results as needed. AOR, PP and FCP are characterized by unique key numbers that identify them, as well as by information about the data involved in the process.
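To make these associations concrete, the minimal Python sketch below models AOR, PP and FCP records linked by their unique keys. The class and field names (AOR, PipelinePedigree, FluxCalibrationParameter, aor_id, and so on) are hypothetical and chosen only for illustration; they do not reproduce the actual DCS schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

# Hypothetical sketch of the AOR / PP / FCP associations described in
# Section 2.  Class and field names are illustrative only, not the DCS schema.

@dataclass
class AOR:
    """Astronomical Observation Request: planning info plus pipeline selection."""
    aor_id: str                      # unique key identifying the request
    instrument: str                  # e.g. "FLITECAM"
    observation_type: str            # links science and calibration data
    pipeline: str                    # reduction pipeline to apply post-flight
    parameters: dict = field(default_factory=dict)

@dataclass
class PipelinePedigree:
    """Pipeline Pedigree: records how a Level 2 product was generated."""
    pp_id: str                       # unique key for this pipeline run
    aor_id: str                      # back-reference to the originating AOR
    pipeline: str
    parameters: dict
    processing_date: date
    input_files: List[str]           # data involved in the process

@dataclass
class FluxCalibrationParameter:
    """FCP: documents a flux calibration so Level 3 results can be reproduced."""
    fcp_id: str
    pp_ids: List[str]                # pedigrees of the Level 2 data that were calibrated
    calibration_files: List[str]
    parameters: dict = field(default_factory=dict)
```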
3. Architecture

The DCS will provide the framework for both automatic pipelining and human-in-the-loop processing. The DCS will host automatic pipelining at End of Flight (EoF), user-initiated pipelining, and user-interactive processing and analysis. Figure 1 illustrates how these scenarios relate to each other. The CORE is in charge of data processing within the DCS (green actors). Users can perform data processing outside the DCS (red actors) and use DCS tools to extract and archive data. Data flows are shown as dashed arrows and process requests as plain arrows. We describe below the four main data processing scenarios illustrated in Figure 1.

Figure 1. Data processing scenarios.

• EoF automatic pipelining producing immediate Level 2 products [green area]
Flight data, for example raw observations or flight-processed products, are ingested at EoF. The DCS calls pipelines automatically after ingestion of the data observed during flight operation. Products from data reduction, typically Level 2 data, are automatically archived as the data are processed, making them quickly available for scientific analysis (a minimal sketch of this scenario follows the list).

• Flux calibration producing Level 3 products [purple area]
Outstanding scientific results can be obtained only with flux calibrated data. Flux calibration is a complicated process that is difficult to automate, especially for an airborne observatory. The difficulty of defining a metric of data quality makes the intervention of experienced scientists necessary. Final Level 3 products can be archived in the SOFIA database together with their associated FCP.

• User-initiated processing and inspection (all levels) [gray area]
Pipelining can also be manually initiated by SMO scientists. For example, they may re-pipeline the data with modified parameters which would improve the quality of the final results, or when a new version of a pipeline becomes available.

• User-interactive processing (all levels) [orange area]
Human intervention is often needed to verify results at any of the data levels. The DCS will provide an interaction interface to extract data from the archive and run locally the same algorithms used during automatic pipelining. This allows the user to analyze the data at any step of the process, eliminate undesirable data and fine-tune the parameters of the reduction. This step results either in validation of the data or in the appropriate parameters required to re-pipeline the data in order to improve the quality of the final product.
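A minimal sketch of the first scenario is given below. It treats the pipeline as an executable delivered by the instrument team, invoked as a black box in the sense of Section 4, and records a Pipeline Pedigree for the archived Level 2 products. The command-line convention, the archive object with its store() method, and the pedigree fields are assumptions made for illustration; they are not the actual DCS interfaces.

```python
import subprocess
from datetime import date
from typing import List

# Hypothetical sketch of the EoF automatic pipelining scenario (green area of
# Figure 1).  The command-line convention, the archive interface and the
# pedigree fields are illustrative assumptions, not the actual DCS interfaces.

def run_eof_pipelining(aor, level1_files: List[str], archive) -> dict:
    """Run the pipeline named in the AOR on freshly ingested Level 1 data,
    archive the Level 2 products and record their Pipeline Pedigree."""
    # The pipeline is an executable delivered by the instrument team and is
    # invoked as a black box with its input files and the AOR parameters.
    command = [aor.pipeline, *level1_files]
    for key, value in aor.parameters.items():
        command += [f"--{key}", str(value)]
    result = subprocess.run(command, capture_output=True, text=True, check=True)

    # Assume the executable prints the paths of the Level 2 products it wrote.
    level2_files = result.stdout.split()

    # Pipeline Pedigree record (compare the PipelinePedigree sketch above).
    pedigree = {
        "pp_id": f"PP-{aor.aor_id}-{date.today().isoformat()}",
        "aor_id": aor.aor_id,
        "pipeline": aor.pipeline,
        "parameters": dict(aor.parameters),
        "processing_date": date.today(),
        "input_files": list(level1_files),
    }
    # Level 2 products are archived as soon as they are produced, making them
    # quickly available for scientific analysis.
    archive.store(level2_files, pedigree)
    return pedigree
```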
4. Pipeline Approaches

A pipeline is a collection of algorithms which are run in a particular order. The DCS will host pipelines coded in IDL, Python and other languages, delivered by the instrument teams with a description specified as XML. With the appropriate pipeline specification, the DCS can currently run pipelines in any language with no modification of the code, as soon as the pipeline is delivered as an executable, likely the same one that runs on SMO machines outside the DCS. Because the DCS has no knowledge of the details of the pipeline execution after it is called, we call this a blackbox pipeline. Level 2 blackboxes are applied based on the specifications of the AOR, which is detailed before the flight as part of the observation planning process, and the details of the process are recorded in the PP, which is created after pipelining (Section 2). This approach is currently implemented in the DCS and embraces both automatic and user-initiated pipelining within the same framework. It answers the need for re-pipelining with the goal of improving the quality of the Level 2 data by fine tuning pipeline parameters after manual inspection or by applying an improved version of the pipeline. Although this approach represents an enormous cost saving on the implementation and maintenance of the pipelines, it lacks the advanced functionality that the DCS could offer, including parallel execution of processes within a single pipeline, status reporting, and intermediate user intervention.

We plan to complement the current functionality with another approach allowing human interaction. User interaction is required for step-by-step data processing and intermediate data analysis. These will be performed using a DCS graphical interface tool which runs user-interactive data processing and analysis tools locally (outside the DCS) after downloading updated algorithms from the DCS. As a long-term goal, the DCS will integrate user-interactive pipelining within the same framework as automatic pipelining. For that purpose, pipelines will be delivered as a collection of functions (modules), each performing a portion of the pipeline, together with XML files describing them. The pipeline recipe (another XML file) will describe which modules are executed, the order of execution, and how data are transferred between modules. Technically, pipeline manager objects (pipe_man) are in charge of executing specific modules (the module->process method) or the whole pipeline (the pipe_man->run method). This new approach fits into the current blackbox structure by calling the run method as the pipeline executable. When implemented within the DCS, pipe_man will be able to process modules in parallel, control their execution, and allow user data analysis. In addition, pipe_man will manage modules written in different computer languages for the same pipeline, thus reducing the number of algorithms in the system. Instrument teams will be encouraged to use existing algorithms when developing their pipelines, resulting in a common library of algorithms which will decrease the effort of the instrument teams in developing pipelines and of the DCS team in maintaining and upgrading them.
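The sketch below illustrates how such a recipe-driven pipeline manager might look. The XML recipe format, the module interface and the PipeMan class are assumptions made for illustration and do not reproduce the delivered DCS design; they only show the pattern of a run method driving module->process calls, with independent steps dispatched in parallel.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sketch of a recipe-driven pipeline manager (pipe_man).
# The recipe format, module interface and class names are illustrative
# assumptions, not the delivered DCS design.

RECIPE_XML = """
<recipe pipeline="EXAMPLECAM_IMAGING">
  <step module="dark_subtract"/>
  <step module="flat_field"/>
  <group>  <!-- independent steps: may run in parallel -->
    <step module="bad_pixel_mask"/>
    <step module="astrometry"/>
  </group>
  <step module="coadd"/>
</recipe>
"""

class PlaceholderModule:
    """Stands in for one portion of a pipeline; a real module would wrap the
    instrument team's algorithm (IDL, Python, or an external executable)."""
    def __init__(self, name):
        self.name = name

    def process(self, data):
        # A real module would transform the data; here we only record the step.
        data["history"].append(self.name)
        return data

class PipeMan:
    """Runs a whole pipeline from an XML recipe (pipe_man->run); individual
    steps are executed through each module's process() method."""
    def __init__(self, recipe_xml, registry):
        self.recipe = ET.fromstring(recipe_xml)
        self.registry = registry            # maps module names to module objects

    def run(self, data):
        for element in self.recipe:
            if element.tag == "step":
                data = self.registry[element.get("module")].process(data)
            elif element.tag == "group":
                # Steps inside a <group> are independent, so they are
                # dispatched in parallel; they all see the same data object.
                names = [step.get("module") for step in element]
                with ThreadPoolExecutor() as pool:
                    list(pool.map(lambda n: self.registry[n].process(data), names))
        return data

if __name__ == "__main__":
    names = ["dark_subtract", "flat_field", "bad_pixel_mask", "astrometry", "coadd"]
    pipe_man = PipeMan(RECIPE_XML, {n: PlaceholderModule(n) for n in names})
    print(pipe_man.run({"history": []})["history"])
```

Because the whole pipeline is reachable through a single run() call, an object of this kind could also be wrapped as the executable expected by the existing blackbox framework.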
5. Conclusion

Combining automatic pipelining and user-interactive processing of algorithms that are developed in various languages presents an important challenge to the SOFIA DCS, especially when trying to minimize the effort required for long-term maintenance and upgrades of the code. We divide the problem into four distinct cases of interaction with the data. These scenarios can be developed independently but are based on a common architecture. The case of automatic pipelining, either at EoF or user-initiated, is already implemented and has been demonstrated with FLITECAM data. User-interactive pipelining is in its design phase, but we have shown its feasibility using a prototype implemented in IDL. Flux calibration is not included in current DCS development plans due to resource and schedule constraints, but we provide the required tools for the user to ingest human-validated data.

Acknowledgments. RYS is supported by a USRA contract to the Space Science Institute. For more information about SOFIA visit http://www.sofia.usra.edu.
