
The Beauty of Mathematics in Computer Science PDF

285 Pages·2018·3.54 MB·English
by Jun Wu
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview The Beauty of Mathematics in Computer Science

THE BEAUTY OF MATHEMATICS IN COMPUTER SCIENCE
Jun Wu
Translated from the Chinese edition by Rachel Wu and Yuxi Candice Wang
A Chapman & Hall Book

CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2019 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Printed on acid-free paper

International Standard Book Number-13: 978-1-138-04960-4 (Paperback) 978-1-138-04967-3 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data
Names: Wu, Jun, 1967- author.
Title: The beauty of mathematics in computer science / Jun Wu.
Description: Boca Raton, FL : Taylor & Francis Group, 2019.
Identifiers: LCCN 2018035719 | ISBN 9781138049604 paperback | ISBN 9781138049673 hardback
Subjects: LCSH: Computer science--Mathematics. | Machine learning.
Classification: LCC QA76.9.M35 W84 2019 | DDC 004.01/51--dc23
LC record available at https://lccn.loc.gov/2018035719

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com
and the CRC Press Web site at http://www.crcpress.com

Contents

Foreword
Preface
Acknowledgments

1 Words and languages, numbers and information
  1.1 Information
  1.2 Words and numbers
  1.3 The mathematics behind language
  1.4 Summary

2 Natural language processing—From rules to statistics
  2.1 Machine intelligence
  2.2 From rules to statistics
  2.3 Summary

3 Statistical language model
  3.1 Describing language through mathematics
  3.2 Extended reading: Implementation caveats
    3.2.1 Higher order language models
    3.2.2 Training methods, zero-probability problems, and smoothing
    3.2.3 Corpus selection
  3.3 Summary
  Bibliography

4 Word segmentation
  4.1 Evolution of Chinese word segmentation
  4.2 Extended reading: Evaluating results
    4.2.1 Consistency
    4.2.2 Granularity
  4.3 Summary
  Bibliography

5 Hidden Markov model
  5.1 Communication models
  5.2 Hidden Markov model
  5.3 Extended reading: HMM training
  5.4 Summary
  Bibliography

6 Quantifying information
  6.1 Information entropy
  6.2 Role of information
  6.3 Mutual information
  6.4 Extended reading: Relative entropy
  6.5 Summary
  Bibliography

7 Jelinek and modern language processing
  7.1 Early life
  7.2 From Watergate to Monica Lewinsky
  7.3 An old man’s miracle

8 Boolean algebra and search engines
  8.1 Boolean algebra
  8.2 Indexing
  8.3 Summary

9 Graph theory and web crawlers
  9.1 Graph theory
  9.2 Web crawlers
  9.3 Extended reading: Two topics in graph theory
    9.3.1 Euler’s proof of the Königsberg bridges
    9.3.2 The engineering of a web crawler
  9.4 Summary

10 PageRank: Google’s democratic ranking technology
  10.1 The PageRank algorithm
  10.2 Extended reading: PageRank calculations
  10.3 Summary
  Bibliography

11 Relevance in web search
  11.1 TF-IDF
  11.2 Extended reading: TF-IDF and information theory
  11.3 Summary
  Bibliography

12 Finite state machines and dynamic programming: Navigation in Google Maps
  12.1 Address analysis and finite state machines
  12.2 Global navigation and dynamic programming
  12.3 Finite state transducer
  12.4 Summary
  Bibliography

13 Google’s AK-47 designer, Dr. Amit Singhal

14 Cosines and news classification
  14.1 Feature vectors for news
  14.2 Vector distance
  14.3 Extended reading: The art of computing cosines
    14.3.1 Cosines in big data
    14.3.2 Positional weighting
  14.4 Summary

15 Solving classification problems in text processing with matrices
  15.1 Matrices of words and texts
  15.2 Extended reading: Singular value decomposition method and applications
  15.3 Summary
  Bibliography

16 Information fingerprinting and its application
  16.1 Information fingerprint
  16.2 Applications of information fingerprint
    16.2.1 Determining identical sets
    16.2.2 Detecting similar sets
    16.2.3 YouTube’s anti-piracy
  16.3 Extended reading: Information fingerprint’s repeatability and SimHash
    16.3.1 Probability of repeated information fingerprint
    16.3.2 SimHash
  16.4 Summary
  Bibliography

17 Thoughts inspired by the Chinese TV series Plot: The mathematical principles of cryptography
  17.1 The spontaneous era of cryptography
  17.2 Cryptography in the information age
  17.3 Summary

18 Not all that glitters is gold: Search engine’s anti-SPAM problem and search result authoritativeness question
  18.1 Search engine anti-SPAM
  18.2 Authoritativeness of search results
  18.3 Summary

19 Discussion on the importance of mathematical models

20 Don’t put all your eggs in one basket: The principle of maximum entropy
  20.1 Principle of maximum entropy and maximum entropy model
  20.2 Extended reading: Maximum entropy model training
  20.3 Summary
  Bibliography

21 Mathematical principles of Chinese input method editors
  21.1 Input method and coding
  21.2 How many keystrokes to type a Chinese character? Discussion on Shannon’s First Theorem
  21.3 The algorithm of phonetic transcription
  21.4 Extended reading: Personalized language models
  21.5 Summary

22 Bloom filters
  22.1 The principle of Bloom filters
  22.2 Extended reading: The false alarm problem of Bloom filters
  22.3 Summary

23 Bayesian network: Extension of Markov chain
  23.1 Bayesian network
  23.2 Bayesian network application in word classification
  23.3 Extended reading: Training a Bayesian network
  23.4 Summary

24 Conditional random fields, syntactic parsing, and more
  24.1 Syntactic parsing—the evolution of computer algorithms
  24.2 Conditional random fields
  24.3 Conditional random field applications in other fields
  24.4 Summary

25 Andrew Viterbi and the Viterbi algorithm
  25.1 The Viterbi algorithm
  25.2 CDMA technology: The foundation of 3G mobile communication
  25.3 Summary

26 God’s algorithm: The expectation-maximization algorithm
  26.1 Self-converged document classification
  26.2 Extended reading: Convergence of expectation-maximization algorithms
  26.3 Summary

27 Logistic regression and web search advertisement
  27.1 The evaluation of web search advertisement
  27.2 The logistic model
  27.3 Summary

28 Google Brain and artificial neural networks
  28.1 Artificial neural network
  28.2 Training an artificial neural network
  28.3 The relationship between artificial neural networks and Bayesian networks
  28.4 Extended reading: “Google Brain”
  28.5 Summary
  Bibliography

29 The power of big data
  29.1 The importance of data
  29.2 Statistics and information technology
  29.3 Why we need big data
  29.4 Summary
  Bibliography

Postscript
Index

Description:
The Beauty of Mathematics in Computer Science explains the mathematical fundamentals of information technology products and services we use every day, from Google Web Search to GPS Navigation, and from speech recognition to CDMA mobile services. The book was published in Chinese in 2011 and has sold
