ebook img

From Data Mining to Big Data & Data Science: a PDF

98 Pages·2013·2.29 MB·English
by  
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview From Data Mining to Big Data & Data Science: a

From Data Mining to Big Data & Data Science: a Computational Perspective e v i t c e p Miquel Sànchez-Marrè s r e P l a n Knowledge Engineering & Machine Learning Group (KEMLG) o i at Dept. de Llenguatges i Sistemes Informàtics t u p Universitat Politècnica de Catalunya-BarcelonaTech m o C [email protected] a : http://www.lsi.upc.edu/~miquel S D & 19 de Julio 2013 D B https://kemlg.upc.edu o t M D m o r F Content Introduction  Knowledge Discovery in Databases e  v i t Data Pre-processing c  e p s Data Mining Techniques: Statistical & ML methods  r e P Data Post-Processing  l a n Big Data / Data Science o  i t a ut  Social Mining p m o  Scaling from Data Mining to Big Mining C a Computational Techniques  : S D Examples of Extrapolation of classical DM Techniques  & D A look to Mahout  B o t Big Data Tools & Resources  M D Big Data Trends m  o r F 2 Big Data & Data Science, Curso de Verano, 16-19 de julio de 2013, Universidade da Santiago de Compostela INTRODUCTION e v i t c e p s r e P l a n o i t a t u p m o C a : S D & D B https://kemlg.upc.edu o t M D m o r F Complex Real-world Systems/Domains Exist in the daily real life of human beings, and normally show a  e strongcomplexityfor their understanding, analysis, management or v i solving. t c e p They imply several decision making tasks very complexand  s r e difficult, which usually are faced up by human experts P al  Some of them, in addition, could have catastrophicconsequences n o either for human beings or for the environment or for the economy of i t a one organization t u p m  Examples o Environmental System/Domains C  a Medical System/Domains  : S Industrial Process Management Systems/Domains D  &  Business Administration & Management Systems/Domains D Marketing  B o  Decisions on products and prices t Decisions on human resources  M D  Decisions on strategies and company policy m Internet  o r F 4 Big Data & Data Science, Curso de Verano, 16-19 de julio de 2013, Universidade da Santiago de Compostela Need for Decision Making Support Tools Complexity of the decision making process  e Accurate Evaluation of multiple alternatives v  i t c e Need for forecasting capabilities p  s r e Uncertainty P  l a n Data Analysis and Exploitation  o i t a Need for including experience and expertise (knowledge) t  u p m o C a : S D Computational tools: Decision Support Systems (DSS)  & D Intelligent Computational Tools: Intelligent Decision B  o Support Systems (IDSS) t M D m o r F 5 Big Data & Data Science, Curso de Verano, 16-19 de julio de 2013, Universidade da Santiago de Compostela Management Information Systems MANAGEMENT INFORMATION SYSTEMS e v i ct DECISION TRANSACTION e SUPPORT p PROCESSING s SYSTEMS r SYSTEMS e (DSS) P l a INTELLIGENT DECISION SUPPORT SYSTEMS (IDSS) n o i t a t u AUTOMATIC p MODEL-DRIVEN DATA-DRIVEN MODEL-DRIVEN m DSS DSS o DSS C a : S D TOP-DOWN BOTTOM-UP & APPROACH / APPROACH / D CONFIRMATIVE EXPLORATIVE B MODELS MODELS o t M STATISTICAL / ARTIF. INTELLIG / D NUMERICAL MACH. LEARN. m o METHODS METHODS r F 6 Big Data & Data Science, Curso de Verano, 16-19 de julio de 2013, Universidade da Santiago de Compostela USER T ECONOMICAL LAWS & R O COSTS REGULATIONS P USER INTERFACE P U S N O I S CI EXPLANATION ALTERNATIVES EVAL. e E D ctiv Intelligent PLANNING / PREDICTION / SUPERVISION DStercaitseiogny/ e S REASONING / MODELS’ INTEGRATION p I Decision S s O er GN AI STATISTICAL NUMERICAL A P Support I MODELS MODELS MODELS D l a DATA MINING Feedback G n Systems N KNOWLEDGE ACQUISITION / LEARNING o NI Background/ i at (IDSS) MI Subjective t A Knowledge u T A p [Sànchez-Marrè et al., 2000] D GIS DATA BASE m (SPATIAL DATA) (TEMPORAL DATA) o N C O a TI ON-LINE / A Off-line T OFF-LINE S: RE On-line data ACTUATORS D P R data E Spatial / & T N Geographical SUBJECTIVE I D data INFORMATION / Actuation A B T ANALYSES A D SENSORS o t M D m SYSTEM / PROCESS o r F 7 Big Data & Data Science, Curso de Verano, 16-19 de julio de 2013, Universidade da Santiago de Compostela Why Data Mining is important ? From observations to decisions [Adapted from A.D. Witakker, 1993] e DDEECCIISSIIOONN v i ct HHiigghh e RReeccoommeennddaattiioonnss p s r e P IINNTTEERRPPRREETTAATTIIOONN CCoonnsseeqquueenncceess l a n o PPrreeddiiccttiioonnss ti VVaalluuee aanndd a ut RReelleevvaannccee KKNNOOWWLLEEDDGGEE p m ttoo o DDeecciissiioonnss UUnnddeerrssttaannddiinngg C a : DDaattaa S D & OObbsseerrvvaattiioonnss D B LLooww o t M D HHiigghh LLooww QQuuaannttiiyy ooff IInnffoorrmmaattiioonn m o r F 8 Big Data & Data Science, Curso de Verano, 16-19 de julio de 2013, Universidade da Santiago de Compostela KNOWLEDGE DISCOVERY / DATA MINING e v i t c e p s Intelligent Data Analysis r e P l a n o i t a t u p m o C a : S D & D B https://kemlg.upc.edu o t M D m o r F KDD Concepts e  KDD v i t c Knowledge Discovery in Databases is the non-trivial e  p process of identifying valid, novel, potentially useful, and s r e ultimately understandable patterns in data [Fayyad et al., P l a 1996] n o ati  KDD is a multi-step process t u p Data Mining  m o C  Data Mining is a step in the KDD process consisting of a particular data mining algorithms that, under some : S D aceptable computational efficiency limitations, produces a & particular set of patterns over a set of examples or cases D B or data. [Fayyad et al., 1996] o t M D m o r F 10 Big Data & Data Science, Curso de Verano, 16-19 de julio de 2013, Universidade da Santiago de Compostela

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.