Cognitive Computation Trends 2
Series Editor: Amir Hussain
Kaizhu Huang · Amir Hussain
Qiu-Feng Wang
Rui Zhang
Editors
Deep Learning:
Fundamentals,
Theory and
Applications
Cognitive Computation Trends
Volume 2
Series Editor
Amir Hussain
School of Computing
Edinburgh Napier University
Edinburgh, UK
Cognitive Computation Trends is an exciting new book series covering cutting-edge research, practical applications, and future trends across the whole spectrum of multi-disciplinary fields encompassed by the emerging discipline of Cognitive Computation. The series aims to bridge the existing gap between life sciences, social sciences, engineering, physical and mathematical sciences, and humanities. The broad scope of Cognitive Computation Trends covers basic and applied work involving bio-inspired computational, theoretical, experimental, and integrative accounts of all aspects of natural and artificial cognitive systems, including: perception, action, attention, learning and memory, decision making, language processing, communication, reasoning, problem solving, and consciousness.
More information about this series at http://www.springer.com/series/15648
Kaizhu Huang • Amir Hussain • Qiu-Feng Wang
Rui Zhang
Editors
Deep Learning:
Fundamentals, Theory
and Applications
Editors

Kaizhu Huang
Xi’an Jiaotong-Liverpool University
Suzhou, China

Amir Hussain
School of Computing
Edinburgh Napier University
Edinburgh, UK

Qiu-Feng Wang
Xi’an Jiaotong-Liverpool University
Suzhou, China

Rui Zhang
Xi’an Jiaotong-Liverpool University
Suzhou, China
ISSN 2524-5341            ISSN 2524-535X (electronic)
Cognitive Computation Trends
ISBN 978-3-030-06072-5    ISBN 978-3-030-06073-2 (eBook)
https://doi.org/10.1007/978-3-030-06073-2
Library of Congress Control Number: 2019930405
© Springer Nature Switzerland AG 2019
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
Over the past 10 years, deep learning has attracted a lot of attention, and many exciting results have been achieved in various areas, such as speech recognition, computer vision, handwriting recognition, machine translation, and natural language understanding. Rather surprisingly, the performance of machines has even surpassed that of humans in some specific areas. The fast development of deep learning has already started impacting people’s lives; however, challenges still exist. In particular, the theory behind successful deep learning has yet to be clearly explained, and realizing state-of-the-art performance with deep learning models requires tremendous amounts of labelled data. Further, optimizing deep learning models can require substantial time for real-world applications. Hence, much effort is still needed to investigate deep learning theory and apply it in various challenging areas. This book looks at some of the problems involved and describes, in depth, the fundamental theories, some possible solutions, and the latest techniques achieved by researchers in the areas of machine learning, computer vision, and natural language processing.
The book comprises six chapters, each preceded by an introduction and followed by a comprehensive list of references for further reading and research. The chapters are summarized below:
Density models provide a framework to estimate distributions of the data, which is a major task in machine learning. Chapter 1 introduces deep density models with latent variables, which are based on a greedy layer-wise unsupervised learning algorithm. Each layer of the deep models employs a model that has only one layer of latent variables, such as the Mixtures of Factor Analyzers (MFAs) and the Mixtures of Factor Analyzers with Common Loadings (MCFAs).
Recurrent Neural Network (RNN)-based deep learning models have been widely investigated for sequence pattern recognition, especially the Long Short-Term Memory (LSTM). Chapter 2 introduces a deep LSTM architecture and a Connectionist Temporal Classification (CTC) beam search algorithm, and evaluates this design on online handwriting recognition.
Following on from the above deep learning-related theories, Chapters 3, 4, 5, and 6 introduce recent advances in applications of deep learning methods in several areas. Chapter 3 overviews the state-of-the-art performance of deep learning-based Chinese handwriting recognition, including both isolated character recognition and text recognition.
Chapters 4 and 5 describe applications of deep learning methods in natural language processing (NLP), a key research area in artificial intelligence (AI). NLP aims at designing computer algorithms to understand and process natural language in the same way humans do. Specifically, Chapter 4 focuses on NLP fundamentals, such as word embedding or representation methods via deep learning, and describes two powerful learning models in NLP: Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs). Chapter 5 addresses deep learning technologies in a number of benchmark NLP tasks, including entity recognition, super-tagging, machine translation, and text summarization.
Finally, Chapter 6 introduces oceanic data analysis with deep learning models, focusing on how CNNs are used for ocean front recognition and LSTMs for sea surface temperature prediction, respectively.
In summary, we believe this book will serve as a useful reference for senior (undergraduate or graduate) students in computer science, statistics, and electrical engineering, as well as others interested in studying or exploring the potential of exploiting deep learning algorithms. It will also be of special interest to researchers in the areas of AI, pattern recognition, machine learning, and related fields, along with engineers interested in applying deep learning models to existing or new practical applications. In terms of prerequisites, readers are assumed to be familiar with basic machine learning concepts, multivariate calculus, probability, and linear algebra, as well as computer programming skills.
Suzhou, China    Kaizhu Huang
Edinburgh, UK    Amir Hussain
Suzhou, China    Qiu-Feng Wang
Suzhou, China    Rui Zhang

March 2018
Contents
1 Introduction to Deep Density Models with Latent Variables ............. 1
  Xi Yang, Kaizhu Huang, Rui Zhang, and Amir Hussain
2 Deep RNN Architecture: Design and Evaluation .......................... 31
  Tonghua Su, Li Sun, Qiu-Feng Wang, and Da-Han Wang
3 Deep Learning Based Handwritten Chinese Character and Text
  Recognition ........................................................... 57
  Xu-Yao Zhang, Yi-Chao Wu, Fei Yin, and Cheng-Lin Liu
4 Deep Learning and Its Applications to Natural Language
  Processing ............................................................ 89
  Haiqin Yang, Linkai Luo, Lap Pong Chueng, David Ling, and Francis Chin
5 Deep Learning for Natural Language Processing ......................... 111
  Jiajun Zhang and Chengqing Zong
6 Oceanic Data Analysis with Deep Learning Models ....................... 139
  Guoqiang Zhong, Li-Na Wang, Qin Zhang, Estanislau Lima, Xin Sun,
  Junyu Dong, Hui Wang, and Biao Shen
Index ................................................................... 161
Chapter 1
Introduction to Deep Density Models with Latent Variables

Xi Yang, Kaizhu Huang, Rui Zhang, and Amir Hussain

X. Yang · K. Huang (✉) · R. Zhang
Xi’an Jiaotong-Liverpool University, Suzhou, China
e-mail: xi.yang@xjtlu.edu.cn; kaizhu.huang@xjtlu.edu.cn; rui.zhang02@xjtlu.edu.cn

A. Hussain
School of Computing, Edinburgh Napier University, Edinburgh, UK
e-mail: a.hussain@napier.ac.uk
Abstract This chapter introduces deep density models with latent variables, which are based on a greedy layer-wise unsupervised learning algorithm. Each layer of the deep models employs a model that has only one layer of latent variables, such as the Mixtures of Factor Analyzers (MFAs) and the Mixtures of Factor Analyzers with Common Loadings (MCFAs). As background, the MFA and MCFA approaches are reviewed. A comparison between the two approaches shows that sharing a common loading is more physically meaningful, since the common loading can be regarded as a kind of feature selection or reduction matrix. Importantly, MCFAs can remarkably reduce the number of free parameters compared with MFAs. The deep models (deep MFAs and deep MCFAs) and their inference are then described, showing that the greedy layer-wise algorithm is an efficient way to learn deep density models and that deep architectures can be much more efficient (sometimes exponentially so) than shallow architectures. Performance is evaluated for the two shallow models and the two deep models separately, on both density estimation and clustering. Furthermore, the deep models are also compared with their shallow counterparts.
Keywords Deep density model · Mixture of factor analyzers · Common component factor loading · Dimensionality reduction
1.1 Introduction
Density models provide a framework for estimating distributions of the data
and therefore emerge as one of the central theoretical approaches for designing
machines (Rippel and Adams 2013; Ghahramani 2015). One of the essential
probabilistic methods is to adopt latent variables, which reveal data structure and explore features for subsequent discriminative learning. Latent variable models are widely used in machine learning, data mining, and statistical analysis. Among recent advances in machine learning, a central task is to estimate deep architectures for modeling data with complex structure, such as text, images, and sounds. Deep density models have long been a focus for constructing sophisticated density estimates. Existing models, namely probabilistic graphical models, not only prove theoretically that deep architectures can create a better prior for complex structured data than shallow density architectures, but also have practical significance in prediction, reconstruction, clustering, and simulation (Hinton et al. 2006; Hinton and Salakhutdinov 2006). However, they often encounter computational difficulties in practice, due to a large number of free parameters and costly inference procedures (Salakhutdinov et al. 2007).
To this end, the greedy layer-wise learning algorithm offers an efficient way to learn deep architectures that consist of many layers of latent variables. With this algorithm, the first layer is learned using a model that has only one layer of latent variables; the same scheme can then be extended to train the following layers one at a time (see the sketch below). Compared with previous methods, this deep latent variable model has fewer free parameters, by sharing parameters between successive layers, and a simpler inference procedure, due to a concise objective function.
densitymodelswithlatentvariablesareoftenusedtosolveunsupervisedtasks.In
many applications, the parameters of these density models are determined by the
maximum likelihood, which typically adopts the expectation-maximization (EM)
algorithm (Ma and Xu 2005; Do and Batzoglou 2008; McLachlan and Krishnan
2007). In Sect.1.4, we shall see that EM is a powerful and elegant method to find
themaximumlikelihoodsolutionsformodelswithlatentvariables.
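To make the EM machinery concrete before Sect. 1.4, the following is a minimal sketch (not this chapter's implementation) of EM for a single factor analyzer, $\mathbf{y} = \boldsymbol{\mu} + \boldsymbol{\Lambda}\mathbf{x} + \boldsymbol{\varepsilon}$ with $\mathbf{x} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}_q)$ and diagonal noise covariance $\boldsymbol{\Psi}$, i.e., the one-layer building block that MFAs and MCFAs extend with mixture components. The update equations are the standard factor-analysis ones; the function name, initialization, and variance floor are our own illustrative choices.

```python
import numpy as np

def fit_factor_analyzer(Y, q, n_iter=100, seed=0):
    """EM for a single factor analyzer: y = mu + Lambda x + eps.

    Y: (N, d) data matrix; q: latent dimension.
    Returns mu (d,), Lambda (d, q), and Psi (d,) diagonal noise variances.
    """
    rng = np.random.default_rng(seed)
    N, d = Y.shape
    mu = Y.mean(axis=0)
    Yc = Y - mu                                # centered observations
    S = Yc.T @ Yc / N                          # sample covariance
    Lam = rng.normal(scale=0.1, size=(d, q))   # random loading initialization
    Psi = np.diag(S).copy()                    # diagonal noise initialization

    for _ in range(n_iter):
        # E-step: posterior moments of the latent factors x_n given y_n
        PsiInvLam = Lam / Psi[:, None]                     # Psi^{-1} Lambda
        G = np.linalg.inv(np.eye(q) + Lam.T @ PsiInvLam)   # posterior covariance
        Ex = Yc @ PsiInvLam @ G                            # E[x_n | y_n], (N, q)
        Exx = N * G + Ex.T @ Ex                            # sum_n E[x_n x_n^T]

        # M-step: closed-form updates of the loading matrix and noise variances
        Lam = (Yc.T @ Ex) @ np.linalg.inv(Exx)
        Psi = np.maximum(np.diag(S - Lam @ (Ex.T @ Yc) / N), 1e-6)

    return mu, Lam, Psi

# Example usage on synthetic data (assumed shapes only):
# Y = np.random.default_rng(1).normal(size=(500, 20))
# mu, Lam, Psi = fit_factor_analyzer(Y, q=5)
```

Each iteration alternates between computing posterior moments of the latent factors (E-step) and re-estimating the loading matrix and noise variances in closed form (M-step); a standard property of EM is that the likelihood never decreases across iterations.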
1.1.1 Density Model with Latent Variables
Density estimation is a major task in machine learning. In general, the most commonly used method is Maximum Likelihood Estimation (MLE). In this way, we can establish a likelihood function $\mathcal{L}(\mu, \Sigma) = \sum_{n=1}^{N} \ln p(\mathbf{y}_n \mid \mu, \Sigma)$.¹ However, directly maximizing this likelihood function is computationally difficult because of the very high dimensionality of $\Sigma$. Thus, a set of variables $\mathbf{x}$ is defined to govern multiple $\mathbf{y}$, and once the distribution $p(\mathbf{x})$ is found, $p(\mathbf{y})$ can be determined by the joint distribution over $\mathbf{y}$ and $\mathbf{x}$. Typically, the covariance $\Sigma$ is ruled out. In this setting, $\mathbf{x}$ is assumed to affect the manifest (observable) variables, but it is not itself directly observable; $\mathbf{x}$ is therefore called a latent variable (Loehlin 1998). Importantly, the introduction of latent variables allows the formation of complicated distributions from simpler components. In statistics,

¹ $\mathbf{y}_n$: the $N$ observed data points; $\mu$: mean; $\Sigma$: covariance.
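As a concrete instance of forming a complicated distribution from simpler components, consider a single factor analyzer; the identities below are standard factor-analysis results, sketched here for orientation rather than quoted from this chapter:

```latex
% Marginalizing out the latent factors of one factor analyzer
% (standard factor-analysis identities, assumed for illustration):
\begin{align}
  p(\mathbf{y}) &= \int p(\mathbf{y} \mid \mathbf{x})\, p(\mathbf{x})\, \mathrm{d}\mathbf{x}, \\
  \mathbf{y} \mid \mathbf{x} &\sim \mathcal{N}\!\left(\boldsymbol{\mu} + \boldsymbol{\Lambda}\mathbf{x},\ \boldsymbol{\Psi}\right),
  \qquad \mathbf{x} \sim \mathcal{N}(\mathbf{0}, \mathbf{I}_q), \\
  \Rightarrow\ p(\mathbf{y}) &= \mathcal{N}\!\left(\mathbf{y} \mid \boldsymbol{\mu},\ \boldsymbol{\Lambda}\boldsymbol{\Lambda}^{\top} + \boldsymbol{\Psi}\right).
\end{align}
```

The full covariance is thus replaced by the low-rank-plus-diagonal structure $\boldsymbol{\Lambda}\boldsymbol{\Lambda}^{\top} + \boldsymbol{\Psi}$ with $\boldsymbol{\Lambda} \in \mathbb{R}^{d \times q}$ and $q \ll d$, which is precisely how latent variables sidestep the high dimensionality of $\Sigma$ noted above.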