
Mining Abstractions in Scientific Workflows

251 Pages·2015·5.08 MB·English
by Daniel Garijo Verdejo

Preview Mining Abstractions in Scientific Workflows

Departamento de Inteligencia Artificial
Escuela Técnica Superior de Ingenieros Informáticos

PhD Thesis

Mining Abstractions in Scientific Workflows

Author: Daniel Garijo Verdejo
Supervisors: Prof. Dr. Oscar Corcho, Prof. Dra. Yolanda Gil

December, 2015

Committee appointed by the Rector of the Universidad Politécnica de Madrid on 30 October 2015:

President: Dra. Asunción Gómez Pérez
Member: Dr. Jose Manuel Gómez Pérez
Member: Dr. Malcolm Atkinson
Member: Dr. Rafael Tolosana
Secretary: Dr. Mark Wilkinson
Substitute: Dr. Mariano Fernández López
Substitute: Dra. Belén Díaz Agudo

The thesis was defended and read on 3 December 2015 at the Facultad de Informática.

Grade:

The President    Member 1    Member 2    Member 3    The Secretary

To my parents

Acknowledgements

Finally, after five years, I can finally say that I see light at the end of the tunnel. Maybe the other side is still a bit cloudy at the moment, but the important thing is to have arrived here. And, honestly, I think I wouldn't have made it to this point without all the people who have been by my side during these years.

First, I would like to thank my supervisors Oscar Corcho and Yolanda Gil for guiding me whenever I got stuck and for having the patience to answer all my questions. Furthermore, thanks to their help, together with Asunción Gómez Pérez's advice, I was granted the FPU (Formación de Profesorado Universitario) scholarship from the Ministerio de Ciencia e Innovación. This scholarship has funded the internships and the research described in this document, and I am very grateful for having had the opportunity to enjoy it.

I would also like to thank my family, especially my parents (Francisco Javier and María Felisa) and my sister Elisa, for all their support, advice and suggestions during this period. Even from a distance!
Next up are my lab mates, who have helped me with the figures (María Poveda, I really think you could write a thesis just by doing cool figures), logos (Idafen Santana, also responsible for our soccer team), technical support (Miguel Angel García and Raúl Alcázar), advice for the thesis (Andrés García and Esther Lozano) or just cheering me up when hanging out with them (Dani, Freddy, Carlos, Pablo, Julia, Boris, Alejandro, Olga and Victor). In this regard, I am also very grateful to my friends Sergio, Paloma, David, Cristina and Javier for always being available to have a chat over a beer and discuss things totally unrelated to this thesis.

I also owe special thanks to Paolo Missier and Khalid Belhajjame, who provided very valuable feedback despite having very little time for the review. Next, Varun Ratnakar has always been crucial for some of the technical parts described in this thesis. Varun is one of the best working colleagues one could ever ask for. And finally, I want to thank all the collaborators and project pals I have interacted with during these years, from the w4Ever team (with Carole, Jun, Graham, Raúl, Piotr, Stian, Khalid, Kristina, Lourdes, Susana, Pique) to the people I have met during my internships at the ISI (Dirk, John, Matheus, Felix, Zori).

Abstract

Scientific workflows have been adopted in the last decade to represent the computational methods used in in silico scientific experiments and their associated research products. Scientific workflows have proven useful for sharing and reproducing scientific experiments, allowing scientists to visualize, debug and save time when re-executing previous work. However, scientific workflows may be difficult to understand and reuse. The large number of available workflows in repositories, together with their heterogeneity and lack of documentation and usage examples, may become an obstacle for a scientist aiming to reuse the work of other scientists.
Furthermore, given that it is often possible to implement a method using different algorithms or techniques, seemingly disparate workflows may be related at a higher level of abstraction, based on their common functionality. In this thesis we address the issue of reusability and abstraction by exploring how workflows relate to one another in a workflow repository, mining abstractions that may be helpful for workflow reuse. In order to do so, we propose a simple model for representing and relating workflows and their executions, we analyze the typical common abstractions that can be found in workflow repositories, we explore the current practices of users regarding workflow reuse and we describe a method for discovering useful abstractions for workflows based on existing graph mining techniques. Our results expose the common abstractions and practices of users in terms of workflow reuse, and show how our proposed abstractions have potential to become useful for users designing new workflows.

Description:
... the fragment found will merge them. 7.11 Exact FSM results for corpus WC1 to WC4 using the gSpan algorithm. ... Daniel Garijo, Pinar Alper, Khalid Belhajjame, Oscar Corcho, Yolanda Gil, and Carole Goble. Common Motifs in Scientific Workflows: An