Table of Contents

Studies in Big Data, Volume 7

Noel Lopes
Bernardete Ribeiro

Machine Learning for Adaptive Many-Core Machines – A Practical Approach

Series editor: Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland, e-mail: kacprzyk@ibspan.waw.pl

For further volumes: http://www.springer.com/series/11970

About this Series

The series “Studies in Big Data” (SBD) publishes new developments and advances in the various areas of Big Data, quickly and with a high quality. The intent is to cover the theory, research, development, and applications of Big Data, as embedded in the fields of engineering, computer science, physics, economics and life sciences. The books of the series refer to the analysis and understanding of large, complex, and/or distributed data sets generated from recent digital sources coming from sensors or other physical instruments as well as simulations, crowd sourcing, social networks or other internet transactions, such as emails or video click streams and other. The series contains monographs, lecture notes and edited volumes in Big Data spanning the areas of computational intelligence incl. neural networks, evolutionary computation, soft computing, fuzzy systems, as well as artificial intelligence, data mining, modern statistics and operations research, as well as self-organizing systems. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution, which enable both wide and rapid dissemination of research output.
Noel Lopes
Polytechnic Institute of Guarda
Guarda, Portugal

Bernardete Ribeiro
Department of Informatics Engineering
Faculty of Sciences and Technology
University of Coimbra, Polo II
Coimbra, Portugal

ISSN 2197-6503
ISSN 2197-6511 (electronic)
ISBN 978-3-319-06937-1
ISBN 978-3-319-06938-8 (eBook)
DOI 10.1007/978-3-319-06938-8

Springer Cham Heidelberg New York Dordrecht London

Library of Congress Control Number: 2014939947

© Springer International Publishing Switzerland 2015

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

To Sara and Pedro
To my family
Noel Lopes

To Miguel and Alexander
To my family
Bernardete Ribeiro

Preface

Motivation and Scope

Today the increasing complexity, performance requirements and cost of current (and future) applications in society is transversal to a wide range of activities, from science to business and industry. In particular, this is a fundamental issue in the Machine Learning (ML) area, which is becoming increasingly relevant in a wide diversity of domains. The scale of the data from Web growth and advances in sensor data collection technology have been rapidly increasing the magnitude and complexity of tasks that ML algorithms have to solve.

Much of the data that we are generating and capturing will be available “indefinitely” since it is considered a strategic asset from which useful and valuable information can be extracted. In this context, Machine Learning (ML) algorithms play a vital role in providing new insights from the abundant streams and increasingly large repositories of data. However, it is well-known that the computational complexity of ML methodologies, often directly related with the amount of data, is a limiting factor that can render the application of many algorithms to real-world problems impractical. Thus, the challenge consists of processing such large quantities of data in a realistic (useful) time frame, which drives the need to extend the applicability of existing ML algorithms and to devise parallel algorithms that scale well with the volume of data or, in other words, can handle “Big Data”.
This volume takes a practical approach to addressing this problem, by presenting ways to extend the applicability of well-known ML algorithms with the help of highly scalable Graphics Processing Unit (GPU) parallel implementations. Modern GPUs are highly parallel devices that can perform general-purpose computations, yielding significant speedups for many problems in a wide range of areas. Consequently, the GPU, with its many cores, represents a novel and compelling solution to tackle the aforementioned problem, by providing the means to analyze and study larger datasets.

Rationally, we cannot view the GPU implementations of ML algorithms as a universal solution for the “Big Data” challenges, but rather as part of the answer, which may require the use of different strategies coupled together. In this perspective, this volume addresses other strategies, such as using instance-based selection methods to choose a representative subset of the original training data, which can in turn be used to build models in a fraction of the time needed to derive a model from the complete dataset. Nevertheless, large scale datasets and data streams may require learning algorithms that scale roughly linearly with the total amount of data. Hence, traditional batch algorithms may not be up to the challenge and therefore the book also addresses incremental learning algorithms that continuously adjust their models with upcoming new data. These embody the potential to handle the gradual concept drifts inherent to data streams and non-stationary dynamic databases.

Finally, in practical scenarios, the awareness of handling large quantities of data is often exacerbated by the presence of incomplete data, which is an unavoidable problem for most real-world databases. Therefore, this volume also presents a novel strategy for dealing with this ubiquitous problem that does not significantly affect either the algorithms' performance or the preprocessing burden.

The book is not intended to be a comprehensive survey of the state-of-the-art of the broad field of Machine Learning.
Its purpose is less ambitious and more practical: to explain and illustrate some of the more important methods brought to a practical view of GPU-based implementation, in part to respond to the new challenges of Big Data.

Plan and Organization

The book comprises nine chapters and one appendix. The chapters are organized into four parts: the first part, relating to fundamental topics in Machine Learning and Graphics Processing Units, encloses the first two chapters; the second part includes four chapters and presents the main supervised learning algorithms, including methods to handle missing data and approaches for instance-based learning; the third part, with two chapters, concerns unsupervised and semi-supervised learning approaches; in the fourth part we conclude the book with a summary of the many-core algorithm approaches and techniques developed across this volume and give new trends to scale up algorithms to many-core processors. The self-contained chapters provide an enlightened view of the interplay between ML and GPU approaches.

Chapter 1 details the Machine Learning challenges on Big Data, gives an overview of the topics included in the book, and contains background material on ML, formulating the problem setting and the main learning paradigms.

Chapter 2 presents a new open-source GPU ML library (GPU Machine Learning Library – GPUMLib) that aims at providing the building blocks for the development of efficient GPU ML software. In this context, we analyze the potential of the GPU in the ML area, covering its evolution. Moreover, an overview of the existing ML GPU parallel implementations is presented and we argue for the need of a GPU ML library. We then present the CUDA (Compute Unified Device Architecture) programming model and architecture, which was used to develop the GPU Machine Learning Library (GPUMLib), and we detail its architecture.

Chapter 3 reviews the fundamentals of Neural Networks, in particular the multi-layered approaches, and investigates techniques for reducing the amount of time necessary to build NN models.
Specifically, it focuses on details of a GPU parallel implementation of the Back-Propagation (BP) and Multiple Back-Propagation (MBP) algorithms. An Autonomous Training System (ATS) that significantly reduces the effort necessary for building NN models is also discussed. A practical approach to support the effectiveness of the proposed systems on both benchmark and real-world problems is presented.

Chapter 4 analyses the treatment of missing data and alternatives to deal with this ubiquitous problem generated by numerous causes. It reviews missing data mechanisms as well as methods for handling Missing Values (MVs) in Machine Learning. Unlike pre-processing techniques, such as imputation, a novel approach, the Neural Selective Input Model (NSIM), is introduced. Its application on several datasets with both different distributions and proportions of MVs shows that the NSIM approach is very robust and yields good to excellent results. With scalability in mind, a GPU parallel implementation of the NSIM to cope with Big Data is described.

Chapter 5 considers a class of learning mechanisms known as Support Vector Machines (SVMs). It provides a general view of the machine learning framework and formally describes SVMs as large margin classifiers. It explores the Sequential Minimal Optimization (SMO) algorithm as an optimization methodology to solve an SVM. The rest of the chapter is dedicated to the aspects related to its implementation on multi-thread CPU and GPU platforms. We also present a comprehensive comparison of the evaluation methods on benchmark datasets and on real-world case studies. We intend to give a clear understanding of specific aspects related to the implementation of basic SVM machines in a many-core perspective. Further deployment of other SVM variants is essential for Big Data analytics applications.

Chapter 6 addresses incremental learning algorithms where the models incorporate new information on a sample-by-sample basis.
It introduces a novel algorithm, the Incremental Hypersphere Classifier (IHC), which presents good properties in terms of multi-class support, complexity, scalability and interpretability. The IHC is tested on well-known benchmarks, yielding good classification performance results. Additionally, it can be used as an instance selection method since it preserves class boundary samples. Details of its application to a real case study in the field of bioinformatics are provided.

Chapter 7 deals with unsupervised and semi-supervised learning algorithms. It presents the Non-Negative Matrix Factorization (NMF) algorithm as well as a new semi-supervised method, designated Semi-Supervised NMF (SSNMF). In addition, this chapter also covers a hybrid NMF-based face recognition approach.

Chapter 8 motivates deep learning architectures. It starts by introducing the Restricted Boltzmann Machines (RBMs) and the Deep Belief Networks (DBNs) models. Being unsupervised learning approaches, their importance is shown in multiple facets, specifically by the feature generation through many layers, contrasting with shallow architectures. We address their GPU parallel implementations, giving a detailed explanation of the kernels involved. It includes an extensive experiment, involving the MNIST database of hand-written digits and the HHreco multi-stroke symbol database, in order to gain a better understanding of the DBNs.

In the final Chapter 9 we give an extended summary of the contributions of the book. In addition we present research trends with special focus on big data and stream computing. Finally, to meet future challenges on real-time big data analysis from thousands of sources, new platforms should be exploited to accelerate many-core software research.

Audience

The book is designed for practitioners and researchers in the areas of Machine Learning (ML) and GPU computing (CUDA) and is suitable for postgraduate students in computer science, engineering, information technology and other related disciplines.
Previous background in the areas of ML or GPU computing (CUDA) will be beneficial, although we attempt to cover the basics of these topics.

Acknowledgments

We would like to acknowledge and thank all those who have contributed to bringing this book to publication for their help, support and input.

We thank the many users whose stimulating requirements, arising from the many downloads of the software, led us to include new perspectives in GPUMLib. It turned out to be possible to improve and extend many aspects of the library.

We also wish to thank the Polytechnic Institute of Guarda and the Centre of Informatics and Systems of the Informatics Engineering Department, Faculty of Science and Technologies, University of Coimbra, for their support and for the means provided during the research.

Our thanks to Samuel Walter Best, who reviewed the syntactic aspects of the book.

Our special thanks and appreciation to our editor, Professor Janusz Kacprzyk, of Studies in Big Data, Springer, for his essential encouragement.

Lastly, to our families and friends for their love and support.

Coimbra, Portugal
February 2014

Noel Lopes
Bernardete Ribeiro

Description:

The overwhelming amount of data produced every day and the increasing performance and cost requirements of applications are transversal to a wide range of activities in society, from science to industry. In particular, the magnitude and complexity of the tasks that Machine Learning (ML) algorithms have to solv

Machine Learning for Adaptive Many-Core Machines - A Practical Approach PDF

251 Pages·2015·1.3 MB·English

by Noel Lopes

About Machine Learning for Adaptive Many-Core Machines - A Practical Approach

Detailed Information

Author:	Noel Lopes
Publication Year:	2015
Pages:	251
Language:	English
File Size:	1.3 MB
Format:	PDF
Price:	FREE
