Esteban Zimányi, Ralf-Detlef Kutsche (Eds.)

Business Intelligence

4th European Summer School, eBISS 2014
Berlin, Germany, July 6–11, 2014
Tutorial Lectures

Lecture Notes in Business Information Processing 205

Series Editors
Wil van der Aalst, Eindhoven Technical University, Eindhoven, The Netherlands
John Mylopoulos, University of Trento, Povo, Italy
Michael Rosemann, Queensland University of Technology, Brisbane, QLD, Australia
Michael J. Shaw, University of Illinois, Urbana-Champaign, IL, USA
Clemens Szyperski, Microsoft Research, Redmond, WA, USA

More information about this series at http://www.springer.com/series/7911

Editors
Esteban Zimányi, Université Libre de Bruxelles, Brussels, Belgium
Ralf-Detlef Kutsche, Technische Universität Berlin, Berlin, Germany

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper

Springer International Publishing AG Switzerland is part of Springer Science+Business Media (www.springer.com)

Preface

The Fourth European Business Intelligence Summer School (eBISS 2014) took place in Berlin, Germany, in July 2014. Tutorials were given by renowned experts and covered several recent topics in business intelligence. This volume contains the lecture notes of the summer school.
The first chapter surveys the domain of requirements engineering for decision support systems. This is done in the context of a real-world application for analyzing the impact of the Chagas disease. This disease is classified as a life-threatening disease by the World Health Organization (WHO) and causes numerous deaths every year. The development of the Chagas Information Database (CID) is part of WHO's strategy for advancing disease control. CID is a decision support system that supports national and international authorities in both their day-to-day and long-term decision making. The paper describes the results of applying Pohl's Framework to the requirements engineering phase of this project.

The second chapter presents the application of visual analytics for enabling a multiperspective analysis of mobile phone call data records. The analysis of human mobility is a hot research topic in data mining, geographic information science, and visual analytics. While a wide variety of methods and tools are available, it is still hard to systematically consider a dataset from multiple perspectives. The paper presents a workflow that enables a comprehensive analysis of a publicly available dataset about mobile phone calls of a large population over a long time period. The paper concludes by outlining potential applications of the proposed method.

The third chapter gives an overview of how the Web of documents has evolved into what is referred to as Linked Data. The paper starts with a description of the evolution that led from the first version of the Web to the Web of data, sometimes referred to as the Semantic Web. Based on the "data > information > knowledge" hierarchy, the article makes explicit the structures of knowledge representation and the building blocks of the Web of data. Then, the paper shows how RDF (Resource Description Framework) data can be managed and queried. After that, the paper delves into ontologies, from their creation to their alignment and reasoning. The article concludes by pointing out research and development perspectives in the linked data environment.

The fourth chapter gives a survey of supervised classification on data streams. Research in the statistical learning and data mining fields in the last decade resulted in many learning algorithms that are fast and automatic. However, a strong hypothesis made by these learning algorithms is that all examples can be loaded into memory. Recently, new use cases generating huge amounts of data have appeared, such as monitoring of telecommunication networks, user modeling in dynamic social networks, and web mining. As the volume of data increases rapidly, it is now necessary to use incremental learning algorithms on data streams. The article presents the main approaches to incremental supervised classification available in the literature.

Finally, the fifth chapter presents a survey of existing techniques of knowledge reuse and provides a classification approach for them. The importance of managing organizational knowledge for enterprises has been recognized for decades. Indeed, a systematic development and reuse of knowledge will help to improve the competitiveness of an enterprise. The paper investigates different approaches for knowledge reuse from computer science and business information systems. It proposes a classification approach for these techniques based on the following criteria: reuse technique, reuse situation, capacity of knowledge representation, addressee of knowledge, validation status, scope, and phase of solution development.

In addition to the lectures corresponding to the chapters described above, eBISS 2014 had three other lectures, as follows:
– Frithjof Dau, from SAP Research, Germany: CUBIST - Combining and Uniting Business Intelligence with Semantic Technologies
– Asterios Katsifodimos, from Technische Universität Berlin, Germany: Big Data looks tiny from Stratosphere
– Roel J. Wieringa, from University of Twente, The Netherlands: Design Science Methodology for Business Intelligence

These lectures have no associated chapter in this volume, because their content is reported in recent publications [1], [2], and [3], respectively.

We would like to thank the attendees of the summer school for their active participation, as well as the speakers and their co-authors for the high quality of their contributions in a constantly evolving and highly competitive domain. Finally, the lectures in this volume greatly benefited from the comments of the external reviewers.

January 2015
Esteban Zimányi
Ralf-Detlef Kutsche

References

1. Dau, F., Andrews, S.: Combining business intelligence with semantic technologies: the CUBIST project. In: Hernandez, N., Jäschke, R., Croitoru, M. (eds.) ICCS 2014. LNCS (LNAI), vol. 8577, pp. 281–286. Springer, Heidelberg (2014)
2. Alexandrov, A., et al.: The Stratosphere platform for big data analytics. VLDB J. 23(6), 939–964 (2014)
3. Wieringa, R.J.: Design Science Methodology for Information Systems and Software Engineering. Springer, Heidelberg (2014)

Organization

The Fourth European Business Intelligence Summer School (eBISS 2014) was organized by the Department of Computer and Decision Engineering (CoDE) of the Université Libre de Bruxelles, Belgium, and the Database Systems and Information Management (DIMA) Group of the Technische Universität Berlin, Germany.
Program Committee

Alberto Abelló, Universitat Politècnica de Catalunya, BarcelonaTech, Spain
Marie-Aude Aufaure, École Centrale Paris, France
Ralf-Detlef Kutsche, Technische Universität Berlin, Germany
Patrick Marcel, Université François Rabelais de Tours, France
Esteban Zimányi, Université Libre de Bruxelles, Belgium

External Referees

Waqas Ahmed, Université Libre de Bruxelles, Belgium
Catherine Faron Zucker, Université Nice Sophia Antipolis, France
Dilshod Ibragimov, Université Libre de Bruxelles, Belgium
Marite Kirikova, Riga Technical University, Latvia
Georg Krempl, Otto-von-Guericke University Magdeburg, Germany
Pascale Kuntz, Laboratoire d'Informatique de Nantes Atlantique, France
Gerasimos Marketos, University of Piraeus, Greece
Faisal Orakzai, Université Libre de Bruxelles, Belgium

Contents

On the Complexity of Requirements Engineering for Decision-Support Systems: The CID Case Study
    Ruth Raventós, Stephany García, Oscar Romero, Alberto Abelló, and Jaume Viñas

Multi-perspective Analysis of Mobile Phone Call Data Records: A Visual Analytics Approach
    Gennady Andrienko, Natalia Andrienko, and Georg Fuchs

From the Web of Documents to the Linked Data
    Gabriel Képéklian, Olivier Curé, and Laurent Bihanic

A Survey on Supervised Classification on Data Streams
    Vincent Lemaire, Christophe Salperwyck, and Alexis Bondu

Knowledge Reuse: Survey of Existing Techniques and Classification Approach
    Kurt Sandkuhl

Author Index
On the Complexity of Requirements Engineering for Decision-Support Systems: The CID Case Study

Ruth Raventós, Stephany García, Oscar Romero, Alberto Abelló, and Jaume Viñas

Department of Service and Information Systems Engineering, Universitat Politècnica de Catalunya, Barcelona, Spain
{raventos,sgarcia,oromero,aabello,jvinas}@essi.upc.edu

Abstract. The Chagas disease is classified as a life-threatening disease by the World Health Organization (WHO) and is currently causing the death of 534,000 people every year. To advance disease control, the WHO presented a strategy that included the development of the Chagas Information Database (CID) for surveillance to raise awareness about Chagas. CID is defined as a decision-support system that supports national and international authorities in both their day-to-day and long-term decision making. The requirements engineering for this project was particularly complex, and Pohl's framework was followed. This paper describes the results of applying the framework in this project. Thus, it focuses on the requirements engineering stage. The difficulties found motivated the further study and analysis of the complexity of requirements engineering in decision-support systems and the feasibility of using said framework.

Keywords: Decision-support systems · Business intelligence · Requirements engineering

1 Introduction

The World Health Organization (WHO)¹, a United Nations (UN) agency founded in 1948, is the directing authority for health matters, responsible for providing leadership on global health matters by setting standards and providing technical support to monitor health trends. In October 2010, the WHO launched the First Report on Neglected Tropical Diseases (NTD) [1], which included the "Chagas disease" among others. Chagas Disease (also known as Human American Trypanosomiasis) is classified as a life-threatening disease caused by a protozoan parasite.
According to the report, Chagas is nowadays usually found in 21 Latin American countries, where the disease is transmitted to humans by the faeces of infected triatomine bugs. The WHO report has estimated

¹ http://www.who.int/about/en/.

© Springer International Publishing Switzerland 2015
E. Zimányi and R.-D. Kutsche (Eds.): eBISS 2014, LNBIP 205, pp. 1–38, 2015.
DOI: 10.1007/978-3-319-17551-5_1

