
Broad Learning Through Fusions: An Application on Social Networks PDF

424 Pages·2019·10.09 MB·English

Preview Broad Learning Through Fusions: An Application on Social Networks

Jiawei Zhang · Philip S. Yu

Broad Learning Through Fusions: An Application on Social Networks

Jiawei Zhang
Department of Computer Science
Florida State University
Tallahassee, FL, USA

Philip S. Yu
Department of Computer Science
University of Illinois
Chicago, IL, USA

ISBN 978-3-030-12527-1
ISBN 978-3-030-12528-8 (eBook)
https://doi.org/10.1007/978-3-030-12528-8

© Springer Nature Switzerland AG 2019

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG.
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

To my dear parents Yinhe Zhang and Zhulan Liu.
Jiawei Zhang

To my family.
Philip S. Yu

Preface

This textbook is written for readers interested in broad learning, especially in information fusion and knowledge discovery across multiple fused information sources. Broad learning is a general learning problem that can be studied in various disciplines. To illustrate the problem settings and the learning algorithms more clearly, this book uses the online social network as an example. To make the textbook self-contained, the book also provides an overview of the necessary background knowledge. Readers encountering a textbook on broad learning, machine learning, data mining, and social network mining for the first time should find this book easy to follow.

Overview of the Book

There are 12 chapters in this textbook, divided into four main parts: Part I covers Chaps. 1–3, which give an overview of broad learning, machine learning, and social networks; Part II covers Chaps. 4–6, which present the existing social network alignment problems and algorithms; Part III covers Chaps. 7–11, which provide a comprehensive description of recent social media mining problems across multiple information sources; Part IV covers Chap. 12, which indicates some potential future developments of broad learning. Readers and instructors can use this textbook according to the guidance provided in Chap. 1.

Except for Chap. 12, each chapter also has ten exercises. The exercises are divided into three levels (easy, medium, and hard) and cover basic concepts, theorem proofs, algorithm details, and algorithm implementations. Some of the exercises can be used as after-class homework, and some can be used as course projects instead. Instructors can determine how to use the exercises according to their difficulty levels and the needs of their courses.

Broad learning is a very large topic, and we cannot cover all the material in this textbook. For readers who want to explore other related areas further, we provide a section of bibliography notes at the end of each chapter. Readers can refer to the cited articles for more detailed information about the material that interests them.

Acknowledgments

This book would not have been possible without many contributors whose names did not make it to the cover. We mention them here in the order in which their contributed work appears in this book. The active network alignment algorithm introduced in Sect. 6.4 is based on the collaborative work with Junxing Zhu (National University of Defense Technology) and the work with Yuxiang Ren (Florida State University). The large-scale network synergistic community detection algorithm introduced in Sect. 8.5 is based on the collaborative work with Songchang Jin (National University of Defense Technology). The information diffusion algorithm introduced in Sect. 9.4 and the viral marketing algorithm introduced in Sect. 10.4 are based on the collaborative works with Qianyi Zhan (Jiangnan University).

This book has benefited from significant feedback from our students, friends, and colleagues. We would like to thank the many people who helped read and review the book, which greatly improved both its organization and its detailed contents. We would like to thank Lin Meng (Florida State University) for helping review Chaps. 3, 6, 7, and 11; Yuxiang Ren (Florida State University) for reviewing Chaps. 2, 4, 5, and 6; and Qianyi Zhan (Jiangnan University) for reviewing Chaps. 9 and 10.

Jiawei would like to thank his long-term collaborators, including (sorted according to their last names) Charu C. Aggarwal, Yi Chang, Jianhui Chen, Bowen Dong, Yanjie Fu, Lifang He, Qingbo Hu, Songchang Jin, Xiangnan Kong, Moyin Li, Taisong Li, Kunpeng Liu, Ye Liu, Yuanhua Lv, Guixiang Ma, Xiao Pan, Weixiang Shao, Chuan Shi, Weiwei Shi, Lichao Sun, Pengyang Wang, Sen-Zhang Wang, Yang Yang, Chenwei Zhang, Qianyi Zhan, Lei Zheng, Shi Zhi, Junxing Zhu, and Zhu-Hua Zhou. Jiawei also wants to thank his PhD advisor Philip S. Yu for his guidance during his early years as a researcher and the members of his IFM Lab (Information Fusion and Mining Laboratory). Finally, Jiawei would like to thank his respected parents, Yinhe Zhang and Zhulan Liu, for their selfless love and support. The book took up much of the time that should have been spent with them.

Philip would like to thank his collaborators, including past and current members of his BDSC (Big Data and Social Computing) Lab at UIC.

Special thanks are due to Melissa Fearon and Caroline Flanagan at Springer Publishing Company, who convinced us to go ahead with this project and were constantly supportive and patient in the face of recurring delays and missed deadlines. We thank Paul Drougas from Springer, our editor, for his help in improving the book considerably.

This book is partially supported by NSF through grants IIS-1763365 and IIS-1763325.

Some Other Words

Broad learning is a fast-growing research area, and few people can have a very deep understanding of all the detailed material covered in this book. Although the authors have spent several years exploring the frontier of this area, some mistakes and typos may still remain in this book. We would be grateful if readers informed the authors of any mistakes they find while reading, which will help improve the book in coming editions.

Tallahassee, FL, USA    Jiawei Zhang
Chicago, IL, USA    Philip S. Yu
November 11, 2018

Contents

Part I Background Introduction

1 Broad Learning Introduction
  1.1 What Is Broad Learning
  1.2 Problems and Challenges of Broad Learning
    1.2.1 Cross-Source Information Fusion
    1.2.2 Cross-Source Knowledge Discovery
    1.2.3 Challenges of Broad Learning
  1.3 Comparison with Other Learning Tasks
    1.3.1 Broad Learning vs. Deep Learning
    1.3.2 Broad Learning vs. Ensemble Learning
    1.3.3 Broad Learning vs. Transfer Learning vs. Multi-Task Learning
    1.3.4 Broad Learning vs. Multi-View, Multi-Source, Multi-Modal, Multi-Domain Learning
  1.4 Book Organization
    1.4.1 Part I
    1.4.2 Part II
    1.4.3 Part III
    1.4.4 Part IV
  1.5 Who Should Read This Book
  1.6 How to Read This Book
    1.6.1 To Readers
    1.6.2 To Instructors
    1.6.3 Supporting Materials
  1.7 Summary
  1.8 Bibliography Notes
  1.9 Exercises
  References

2 Machine Learning Overview
  2.1 Overview
  2.2 Data Overview
    2.2.1 Data Types
    2.2.2 Data Characteristics
    2.2.3 Data Pre-processing and Transformation
  2.3 Supervised Learning: Classification
    2.3.1 Classification Learning Task and Settings
    2.3.2 Decision Tree
    2.3.3 Support Vector Machine
  2.4 Supervised Learning: Regression
    2.4.1 Regression Learning Task
    2.4.2 Linear Regression
    2.4.3 Lasso
    2.4.4 Ridge
  2.5 Unsupervised Learning: Clustering
    2.5.1 Clustering Task
    2.5.2 K-Means
    2.5.3 DBSCAN
    2.5.4 Mixture-of-Gaussian Soft Clustering
  2.6 Artificial Neural Network and Deep Learning
    2.6.1 Artificial Neural Network Overview
    2.6.2 Deep Learning
  2.7 Evaluation Metrics
    2.7.1 Classification Evaluation Metrics
    2.7.2 Regression Evaluation Metrics
    2.7.3 Clustering Evaluation Metrics
  2.8 Summary
  2.9 Bibliography Notes
  2.10 Exercises
  References

3 Social Network Overview
  3.1 Overview
  3.2 Graph Essentials
    3.2.1 Graph Representations
    3.2.2 Connectivity in Graphs
  3.3 Network Measures
    3.3.1 Degree
    3.3.2 Centrality
    3.3.3 Closeness
    3.3.4 Transitivity and Social Balance
  3.4 Network Categories
    3.4.1 Homogeneous Network
    3.4.2 Heterogeneous Network
    3.4.3 Aligned Heterogeneous Networks
  3.5 Meta Path
    3.5.1 Network Schema
    3.5.2 Meta Path in Heterogeneous Social Networks
    3.5.3 Meta Path Across Aligned Heterogeneous Social Networks
    3.5.4 Meta Path-Based Network Measures
  3.6 Network Models
    3.6.1 Random Graph Model
    3.6.2 Preferential Attachment Model
  3.7 Summary
  3.8 Bibliography Notes
  3.9 Exercises
  References

Part II Information Fusion: Social Network Alignment

4 Supervised Network Alignment
  4.1 Overview
  4.2 Supervised Network Alignment Problem Definition
  4.3 Supervised Full Network Alignment
    4.3.1 Feature Extraction for Anchor Links
    4.3.2 Supervised Anchor Link Prediction Model
    4.3.3 Stable Matching
  4.4 Supervised Partial Network Alignment
    4.4.1 Partial Network Alignment Description
    4.4.2 Inter-Network Meta Path Based Feature Extraction
    4.4.3 Class-Imbalance Classification Model
    4.4.4 Generic Stable Matching
  4.5 Anchor Link Inference with Cardinality Constraint
    4.5.1 Loss Function for Anchor Link Prediction
    4.5.2 Cardinality Constraint Description
    4.5.3 Joint Optimization Function
    4.5.4 Problem and Algorithm Analysis
    4.5.5 Distributed Algorithm
  4.6 Summary
  4.7 Bibliography Notes
  4.8 Exercises
  References

5 Unsupervised Network Alignment
  5.1 Overview
  5.2 Heuristics Based Unsupervised Network Alignment
    5.2.1 User Names Based Network Alignment Heuristics
    5.2.2 Profile Based Network Alignment Heuristics
  5.3 Pairwise Homogeneous Network Alignment
    5.3.1 Heuristics Based Network Alignment Model
    5.3.2 IsoRank
    5.3.3 IsoRankN
    5.3.4 Matrix Inference Based Network Alignment
  5.4 Multiple Homogeneous Network Alignment with Transitivity Penalty
    5.4.1 Multiple Network Alignment Problem Description
    5.4.2 Unsupervised Multiple Network Alignment
    5.4.3 Transitive Network Matching
  5.5 Heterogeneous Network Co-alignment
    5.5.1 Network Co-alignment Problem Description
    5.5.2 Anchor Link Co-inference
    5.5.3 Network Co-matching
