STATISTICAL AND
MACHINE LEARNING
APPROACHES FOR
NETWORK ANALYSIS
Edited by
MATTHIAS DEHMER
UMIT – The Health and Life Sciences University, Institute for Bioinformatics and
Translational Research, Hall in Tyrol, Austria
SUBHASH C. BASAK
Natural Resources Research Institute
University of Minnesota, Duluth
Duluth, MN, USA
Copyright © 2012 by John Wiley & Sons, Inc. All rights reserved
Published by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or
by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as
permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior
written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to
the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax
(978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should
be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ
07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in
preparing this book, they make no representations or warranties with respect to the accuracy or
completeness of the contents of this book and specifically disclaim any implied warranties of
merchantability or fitness for a particular purpose. No warranty may be created or extended by sales
representatives or written sales materials. The advice and strategies contained herein may not be suitable
for your situation. You should consult with a professional where appropriate. Neither the publisher nor
author shall be liable for any loss of profit or any other commercial damages, including but not limited to
special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our
Customer Care Department within the United States at (800) 762-2974, outside the United States at (317)
572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may
not be available in electronic formats. For more information about Wiley products, visit our web site at
www.wiley.com.
Library of Congress Cataloging-in-Publication Data:
ISBN: 978-0-470-19515-4
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
To Christina
CONTENTS
Preface
Contributors
1 A Survey of Computational Approaches to Reconstruct and Partition Biological Networks
Lipi Acharya, Thair Judeh, and Dongxiao Zhu
2 Introduction to Complex Networks: Measures, Statistical Properties, and Models
Kazuhiro Takemoto and Chikoo Oosawa
3 Modeling for Evolving Biological Networks
Kazuhiro Takemoto and Chikoo Oosawa
4 Modularity Configurations in Biological Networks with Embedded Dynamics
Enrico Capobianco, Antonella Travaglione, and Elisabetta Marras
5 Influence of Statistical Estimators on the Large-Scale Causal Inference of Regulatory Networks
Ricardo de Matos Simoes and Frank Emmert-Streib
6 Weighted Spectral Distribution: A Metric for Structural Analysis of Networks
Damien Fay, Hamed Haddadi, Andrew W. Moore, Richard Mortier, Andrew G. Thomason, and Steve Uhlig
7 The Structure of an Evolving Random Bipartite Graph
Reinhard Kutzelnigg
8 Graph Kernels
Matthias Rupp
9 Network-Based Information Synergy Analysis for Alzheimer Disease
Xuewei Wang, Hirosha Geekiyanage, and Christina Chan
10 Density-Based Set Enumeration in Structured Data
Elisabeth Georgii and Koji Tsuda
11 Hyponym Extraction Employing a Weighted Graph Kernel
Tim vor der Brück
Index
PREFACE
An emerging trend in many scientific disciplines is a strong tendency toward being
transformed into some form of information science. One important pathway in this
transition has been via the application of network analysis. The basic methodology in
this area is the representation of the structure of an object of investigation by a graph
representing a relational structure. It is because of this general nature that graphs have
been used in many diverse branches of science including bioinformatics, molecular
and systems biology, theoretical physics, computer science, chemistry, engineering,
drug discovery, and linguistics, to name just a few. An important feature of the book
“Statistical and Machine Learning Approaches for Network Analysis” is to combine
theoretical disciplines such as graph theory, machine learning, and statistical data
analysis and, hence, to arrive at a new field to explore complex networks by using
machine learning techniques in an interdisciplinary manner.
The age of network science has definitely arrived. Large-scale generation of
genomic, proteomic, signaling, and metabolomic data is allowing the construction
of complex networks that provide a new framework for understanding the molecular
basis of physiological and pathological states. Networks and network-based methods
have been used in biology to characterize genomic and genetic mechanisms as well
as protein signaling. Diseases are looked upon as abnormal perturbations of critical
cellular networks. Onset, progression, and intervention in complex diseases such as
cancer and diabetes are analyzed today using network theory.
Once the system is represented by a network, methods of network analysis can
be applied to extract useful information regarding important system properties and to
investigate its structure and function. Various statistical and machine learning methods
have been developed for this purpose and have already been applied to networks. The
purpose of the book is to demonstrate the usefulness, feasibility, and the impact of the
methods on the scientific field. The 11 chapters in this book, written by internationally
reputed researchers in the field of interdisciplinary network theory, cover a wide range
of topics and analysis methods to explore networks statistically.
The topics we are going to tackle in this book range from network inference and
clustering and graph kernels to biological network analysis for complex diseases using
statistical techniques. The book is intended for researchers and for graduate and advanced
undergraduate students in interdisciplinary fields such as biostatistics, bioinformatics,
chemistry, mathematical chemistry, systems biology, and network physics.
Each chapter is comprehensively presented, accessible not only to researchers from
this field but also to advanced undergraduate or graduate students.
Many colleagues, whether consciously or unconsciously, have provided us with
input, help, and support before and during the preparation of the present book. In
particular, we would like to thank Maria and Gheorghe Duca, Frank Emmert-Streib,
Boris Furtula, Ivan Gutman, Armin Graber, Martin Grabner, D. D. Lozovanu, Alexei
Levitchi, Alexander Mehler, Abbe Mowshowitz, Andrei Perjan, Ricardo de Matos
Simoes, Fred Sobik, and Dongxiao Zhu, and we apologize to all who have mistakenly
not been named. Matthias Dehmer thanks Christina Uhde for giving love and inspiration.
We also thank Frank Emmert-Streib for fruitful discussions during the formation of
this book.
We would also like to thank our editor Susanne Steitz-Filler from Wiley, who has
always been available and helpful. Last but not least, Matthias Dehmer thanks
the Austrian Science Funds (project P22029-N13) and the Standortagentur Tirol for
supporting this work.
Finally, we sincerely hope that this book will serve the scientific community of
network science reasonably well and inspire people to use machine learning-driven
network analysis to solve interdisciplinary problems successfully.
Matthias Dehmer
Subhash C. Basak
CONTRIBUTORS
Lipi Acharya, Department of Computer Science, University of New Orleans, New Orleans, LA, USA
Enrico Capobianco, Laboratory for Integrative Systems Medicine (LISM) IFC-CNR, Pisa (IT); Center for Computational Science, University of Miami, Miami, FL, USA
Christina Chan, Departments of Chemical Engineering and Material Sciences, Genetics Program, Computer Science and Engineering, and Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
Ricardo de Matos Simoes, Computational Biology and Machine Learning Lab, Center for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Queen’s University Belfast, UK
Frank Emmert-Streib, Computational Biology and Machine Learning Lab, Center for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Queen’s University Belfast, UK
Damien Fay, Computer Laboratory, Systems Research Group, University of Cambridge, UK
Hirosha Geekiyanage, Genetics Program, Michigan State University, East Lansing, MI, USA
Elisabeth Georgii, Department of Information and Computer Science, Helsinki Institute for Information Technology, Aalto University School of Science and Technology, Aalto, Finland
Hamed Haddadi, Computer Laboratory, Systems Research Group, University of Cambridge, UK
Thair Judeh, Department of Computer Science, University of New Orleans, New Orleans, LA, USA
Reinhard Kutzelnigg, Math.Tec, Heumühlgasse, Wien, Vienna, Austria
Elisabetta Marras, CRS4 Bioinformatics Laboratory, Polaris Science and Technology Park, Pula, Italy
Andrew W. Moore, School of Computer Science, Carnegie Mellon University, USA
Richard Mortier, Horizon Institute, University of Nottingham, UK
Chikoo Oosawa, Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Fukuoka 820-8502, Japan
Matthias Rupp, Machine Learning Group, Berlin Institute of Technology, Berlin, Germany, and Institute of Pure and Applied Mathematics, University of California, Los Angeles, CA, USA; currently at the Institute of Pharmaceutical Sciences, ETH Zurich, Zurich, Switzerland
Kazuhiro Takemoto, Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Fukuoka 820-8502, Japan; PRESTO, Japan Science and Technology Agency, Kawaguchi, Saitama 332-0012, Japan
Andrew G. Thomason, Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, UK
Antonella Travaglione, CRS4 Bioinformatics Laboratory, Polaris Science and Technology Park, Pula, Italy
Koji Tsuda, Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology AIST, Tokyo, Japan
Steve Uhlig, School of Electronic Engineering and Computer Science, Queen Mary University of London, UK
Tim vor der Brück, Department of Computer Science, Text Technology Lab, Johann Wolfgang Goethe University, Frankfurt, Germany
Xuewei Wang, Department of Chemical Engineering and Material Sciences, Michigan State University, East Lansing, MI, USA
Dongxiao Zhu, Department of Computer Science, University of New Orleans; Research Institute for Children, Children’s Hospital; Tulane Cancer Center, New Orleans, LA, USA
Description: Explore the multidisciplinary nature of complex networks through machine learning techniques. Statistical and Machine Learning Approaches for Network Analysis provides an accessible framework for structurally analyzing graphs by bringing together known and novel approaches on graph classes and graph m