Markov Models for Pattern Recognition: From Theory to Applications

256 Pages·2008·2.62 MB·English

Preview Markov models for pattern recognition: from theory to applications

Markov Models for Pattern Recognition
From Theory to Applications

Gernot A. Fink

With 51 Figures

Gernot A. Fink
Department of Computer Science
University of Dortmund
Otto-Hahn-Str. 16
44221 Dortmund
Germany
[email protected]

Library of Congress Control Number: 2007935304

Originally published in the German language by B.G. Teubner Verlag as "Gernot A. Fink: Mustererkennung mit Markov-Modellen". © B.G. Teubner Verlag | GWV Fachverlage GmbH, Wiesbaden 2003

ISBN 978-3-540-71766-9 Springer Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable for prosecution under the German Copyright Law.

Springer is a part of Springer Science+Business Media
springer.com
© Springer-Verlag Berlin Heidelberg 2008

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typesetting: by the Author
Production: LE-TEX Jelonek, Schmidt & Vöckler GbR, Leipzig
Cover design: Künkel Lopka Werbeagentur, Heidelberg
Printed on acid-free paper 33/3180/YL - 5 4 3 2 1 0

For my parents

Preface

The development of pattern recognition methods on the basis of so-called Markov models is tightly coupled to the technological progress in the field of automatic speech recognition. Today, however, Markov chain and hidden Markov models are also applied in many other fields where the task is the modeling and analysis of chronologically organized data, for example genetic sequences or handwritten texts. Nevertheless, in monographs, Markov models are almost exclusively treated in the context of automatic speech recognition and not as a general, widely applicable tool of statistical pattern recognition.

In contrast, this book puts the formalism of Markov chain and hidden Markov models at the center of its considerations. With the example of the three main application areas of this technology (namely automatic speech recognition, handwriting recognition, and the analysis of genetic sequences), this book demonstrates which adjustments to the respective application area are necessary and how these are realized in current pattern recognition systems. Besides the treatment of the theoretical foundations of the modeling, this book puts special emphasis on the presentation of algorithmic solutions, which are indispensable for the successful practical application of Markov model technology. Therefore, it addresses researchers and practitioners from the field of pattern recognition as well as graduate students with an appropriate major field of study, who want to devote themselves to speech or handwriting recognition, bioinformatics, or related problems and want to gain a deeper understanding of the application of statistical methods in these areas.

The origins of this book lie in the author's extensive research and development in the field of statistical pattern recognition, which initially led to a German book published by Teubner, Wiesbaden, in 2003. The present edition is basically a translation of the German version with several updates and modifications addressing an international audience. This book would not have been possible without the encouragement and support of my colleague Thomas Plötz, University of Dortmund, Germany, whom I would like to cordially thank for his efforts.

Dortmund, July 2007
Gernot A. Fink

Contents

1 Introduction
  1.1 Thematic Context
  1.2 Functional Principles of Markov Models
  1.3 Goal and Structure of the Book

2 Application Areas
  2.1 Speech
  2.2 Writing
  2.3 Biological Sequences
  2.4 Outlook

Part I Theory

3 Foundations of Mathematical Statistics
  3.1 Random Experiment, Event, and Probability
  3.2 Random Variables and Probability Distributions
  3.3 Parameters of Probability Distributions
  3.4 Normal Distributions and Mixture Models
  3.5 Stochastic Processes and Markov Chains
  3.6 Principles of Parameter Estimation
    3.6.1 Maximum Likelihood Estimation
    3.6.2 Maximum a posteriori Estimation
  3.7 Bibliographical Remarks

4 Vector Quantization
  4.1 Definition
  4.2 Optimality
    Nearest-Neighbor Condition
    Centroid Condition
  4.3 Algorithms for Vector Quantizer Design
    Lloyd's Algorithm
    LBG Algorithm
    k-Means-Algorithm
  4.4 Estimation of Mixture Density Models
    EM algorithm
  4.5 Bibliographical Remarks

5 Hidden Markov Models
  5.1 Definition
  5.2 Modeling Emissions
  5.3 Use Cases
  5.4 Notation
  5.5 Evaluation
    5.5.1 The Production Probability
      Forward Algorithm
    5.5.2 The "Optimal" Production Probability
  5.6 Decoding
    Viterbi Algorithm
  5.7 Parameter Estimation
    5.7.1 Foundations
      Forward-Backward Algorithm
    5.7.2 Training Methods
      Baum-Welch Algorithm
      Viterbi training
      Segmental k-Means
    5.7.3 Multiple Observation Sequences
  5.8 Model Variants
    5.8.1 Alternative Algorithms
    5.8.2 Alternative Model Architectures
  5.9 Bibliographical Remarks

6 n-Gram Models
  6.1 Definition
  6.2 Use Cases
  6.3 Notation
  6.4 Evaluation
  6.5 Parameter Estimation
    6.5.1 Redistribution of Probability Mass
      Discounting
    6.5.2 Incorporation of More General Distributions
      Interpolation
      Backing Off
    6.5.3 Optimization of Generalized Distributions
  6.6 Model Variants
    6.6.1 Category-Based Models
    6.6.2 Longer Temporal Dependencies
  6.7 Bibliographical Remarks

Part II Practice

7 Computations with Probabilities
  7.1 Logarithmic Probability Representation
  7.2 Lower Bounds for Probabilities
  7.3 Codebook Evaluation for Semi-Continuous HMMs
  7.4 Probability Ratios

8 Configuration of Hidden Markov Models
  8.1 Model Topologies
  8.2 Modularization
    8.2.1 Context-Independent Sub-Word Units
    8.2.2 Context-Dependent Sub-Word Units
  8.3 Compound Models
  8.4 Profile HMMs
  8.5 Modeling Emissions

9 Robust Parameter Estimation
  9.1 Feature Optimization
    9.1.1 Decorrelation
      Principal Component Analysis I
      Whitening
    9.1.2 Dimensionality Reduction
      Principal Component Analysis II
      Linear Discriminant Analysis
  9.2 Tying
    9.2.1 Model Subunits
    9.2.2 State Tying
    9.2.3 Tying in Mixture Models
  9.3 Initialization of Parameters

10 Efficient Model Evaluation
  10.1 Efficient Evaluation of Mixture Densities
  10.2 Beam Search
  10.3 Efficient Parameter Estimation
    10.3.1 Forward-Backward Pruning
    10.3.2 Segmental Baum-Welch Algorithm
    10.3.3 Training of Model Hierarchies
  10.4 Tree-like Model Organization
    10.4.1 HMM Prefix Trees
    10.4.2 Tree-like Representation for n-Gram Models

11 Model Adaptation
  11.1 Basic Principles
  11.2 Adaptation of Hidden Markov Models
    Maximum-Likelihood Linear-Regression
  11.3 Adaptation of n-Gram Models
    11.3.1 Cache Models
    11.3.2 Dialog-Step Dependent Models
    11.3.3 Topic-Based Language Models

12 Integrated Search Methods
  12.1 HMM Networks
  12.2 Multi-Pass Search
  12.3 Search Space Copies
    12.3.1 Context-Based Search Space Copies
    12.3.2 Time-Based Search Space Copies
    12.3.3 Language-Model Look-Ahead
  12.4 Time-Synchronous Parallel Model Decoding
    12.4.1 Generation of Segment Hypotheses
    12.4.2 Language Model-Based Search

Part III Systems

13 Speech Recognition
  13.1 Recognition System of RWTH Aachen University
  13.2 BBN Speech Recognizer BYBLOS
  13.3 ESMERALDA

14 Character and Handwriting Recognition
  14.1 OCR System by BBN
  14.2 Duisburg Online Handwriting Recognition System
  14.3 ESMERALDA Offline Recognition System

15 Analysis of Biological Sequences
  15.1 HMMER
  15.2 SAM
  15.3 ESMERALDA

References
Index
