ebook img

Infinite Multiple Membership Relational Modeling for Complex Networks PDF

0.18 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Infinite Multiple Membership Relational Modeling for Complex Networks

Infinite Multiple Membership Relational Modeling for Complex Networks MortenMørup,MikkelN.SchmidtandLarsKaiHansen CognitiveSystemsGroup 1 TechnicalUniversityofDenmark 1 {mm,mns,lkh}@imm.dtu.dk 0 2 n a Abstract J 6 Learninglatentstructureincomplexnetworkshasbecomeanimportantproblem 2 fueled by many types of networked data originating from practically all fields of science. In this paper, we propose a new non-parametricBayesian multiple- ] I membership latent feature model for networks. Contrary to existing multiple- S membership models that scale quadratically in the number of vertices the pro- . s posedmodelscaleslinearlyinthenumberoflinksadmittingmultiple-membership c analysisin largescale networks. We demonstratea connectionbetweenthe sin- [ glemembershiprelationalmodelandmultiplemembershipmodelsandshowon 1 “real”sizebenchmarknetworkdatathataccountingformultiplemembershipsim- v provesthelearningoflatentstructureasmeasuredbylinkpredictionwhileexplic- 7 itlyaccountingformultiplemembershipresultinamorecompactrepresentation 9 ofthelatentstructureofnetworks. 0 5 . 1 0 1 Introduction 1 1 The analysis of complexnetworkshasbecome an importantchallenge spurredby the many types : v of networkeddata arising in practically all fields of science. These networksare very differentin i nature ranging from biology networks such as protein interaction [18, 1] and the connectome of X neuronalconnectivity[19]totheanalysisofinteractionbetweenlargegroupsofagentsinsocialand r technologynetworks[14,19,9,20].Manyofthenetworksexhibitastrongdegreeofstructure;thus, a learningthisstructurefacilitatesboththeunderstandingofnetworkdynamics,theidentificationof linkdensityheterogeneities,aswellasthepredictionof“missing”links. WewillrepresentanetworkasagraphG = (V,Y)whereV = {v ,...,v }isthesetofvertices 1 N andY isthesetofobservedlinksandnon-links. LetY ∈ {0,1,?}N×N denotealink(adjacency) matrixwheretheelementy =1ifthereisalinkbetweenvertexv andv ,y =0ifthereisnota ij i j ij link,andy =?iftheexistenceofalinkisunobserved.Furthermore,letY ,Y ,andY denotethe ij 1 0 ? setoflinks,non-links,andunobservedlinksinthegraphrespectively. Overtheyears,amultitudeofmethodsforidentifyinglatentstructureingraphshavebeenproposed, most of which are based on grouping the vertices for the identification of homogeneousregions. Traditionally,thishasbeenbasedonvariouscommunitydetectionapproacheswhereacommunity isdefinedasadenselyconnectedsubsetofverticesthatissparselylinkedtotheremainingnetwork [15, 17]. These structures have for instance been identified by splitting the graph using spectral approaches,analyzingflows,andthroughtheanalysisoftheHamiltonian. Modularityoptimization [15]isaspecialcase thatmeasuresthedeviationofthefractionoflinkswithincommunitiesfrom theexpectedfractionofsuchlinksbasedontheirdegreedistribution[15,17]. Adrawback,however, forthesetypesofanalysesisthattheyarebasedonheuristicsanddonotcorrespondtoanunderlying generativeprocess. 1 v v v v v 1 1 2 4 5 v 2 v v v 3 6 3 v 4 Y v7 v8 vv5 6 v v 9 7 v 8 v 9 Figure1: Left: Exampleofa simplegraphwhereeachoftheverticeshavemultiplememberships indicatedbycolors.Right: Thecorrespondingassignmentmatrix. Probabilistic generative models: Recently, generativemodels for complexnetworks have been proposed where links are drawn according to conditionally independentBernoulli densities, such thattheprobabilityofobservingalinky isgivenbyπ , ij ij p(Y|Π)= Y πiyjij(1−πij)1−yij. (1) (i,j)∈Y In the classical Erdo˝s-Re´nyirandomgraphmodel, each link is includedindependentlywith equal probabilityπ =π ;however,moreexpressivemodelsareneededinordertomodelcomplexlatent ij 0 structureofgraphs.Inthefollowing,wefocusontworelatedmethods:latentclassandlatentfeature models. Latentclassmodels: Inlatentclassmodels,suchasthestochasticblockmodel[16],alsodenoted the relationalmodel (RM), each vertex vi belongs to a class ci, and the probability, πij, of a link betweenv andv isdeterminedbytheclassassignmentsc andc asπ =ρ .Here,ρ ∈[0,1] i j i j ij cicj kℓ denotes the probability of generating a link between a vertex in class k and a vertex in class ℓ. Inferenceinlatentclassmodelsinvolvesdeterminingtheclassassignmentsaswellastheclasslink probabilities.Basedonthis,communitiescanbefoundas(groupsof)classeswithhighinternaland lowexternallinkprobability. In the model proposed by [7] (HW) the class link probability, ρkℓ, is specified by a within-class probabilityη andabetween-classprobabilityη ,i.eρ =η (1−δ )+η δ .. Anotherintuitive c n kℓ n kℓ c kℓ representation,whichwerefertoasDB,istohaveasharedbetween-classprobabilitybutallowfor individualwithin-classprobabilities,i.eρ = η (1−δ )+η δ . Bothoftheserepresentations kℓ n kℓ k kℓ areconsistentwiththenotionofcommunitieswithhighinternalandlowexternallinkdensity,and restrictingthenumberofinteractionparameterscanfacilitatemodelinterpretationcomparedtothe generalRM. Based on the Dirichlet process, [9, 20] propose a non-parametricgeneralization of the stochastic blockmodelwithapotentiallyinfinitenumberofclassesdenotedtheinfiniterelationalmodel(IRM) andinfinitehiddenrelationalmodelrespectively.Thelattergeneralizingthestochasticblockmodel tosimultaneouslymodelpotentialvertexattributes. InferenceinIRMjointlydeterminesthenumber of latent classes as well as class assignments and class link probabilities. This approach readily generalizestotheHWandDBparameterizationsofρ. Latent feature models: In latent feature models, the assumption that each vertex belongs to a single class is relaxed. Instead it is assumed that each vertexv has an associated featurez , and i i thatprobabilitiesoflinksare determinedbased oninteractionsbetweenfeatures. Thisgeneralizes thelatentclassmodels,whicharethespecialcasewherethefeaturesarebinaryvectorswithexactly onenon-zeroelement. Manylatentfeaturemodelssupportthe notionofdiscreteclasses, butallowformixedormultiple memberships(seeFigure1foranillustrationofanetworkwithmultipleclassmemberships).Inthe mixedmembershipstochastic block model(MMSB) [1] the verticesare allowed to have fractional classmemberships.Inbinarymatrixfactorization[11]multiplemembershipsareexplicitlymodeled suchthateachvertexcanbeassignedtomultipleclustersbyaninfinitelatentfeaturemodelbased ontheIndianbuffetprocess(IBP)[6]. [12]studythisapproach,forthespecificcaseofaBernoulli likelihood, Eq. (1), and extend the method to include additionalside information as covariatesin modelingthe linkprobabilities. In their model, the probabilityof a link π is specified byπ = ij ij 2 f ( z z w +s ), where f (·) is a functionwith range [0,1] such as a sigmoid, and w σ kℓ ik jℓ kℓ ij σ kℓ arewPeightsthataffectstheprobabilityofgeneratingalinkbetweenverticesinclusterkandℓ. The terms accountsforbiasaswellasadditionalside-information. Forexample,ifcovariatesφ are ij i available foreach vertexv , [12] suggestincludingthe term s = βd(φ ,φ )+β⊤φ +β⊤φ , i ij i j i i j j whereβ,β ,andβ areregressionparameters,andd(·,·)issomepossiblynonlinearfunction. i j In general, the computationalcost of the single membershipclustering methodsmentionedabove scales linearly in the number of links in the graph. Unfortunately, existing multiple membership models[1,11,12]scalequadraticallyinthenumberofvertices,becausetheyrequireexplicitcompu- tationsforalllinksandnon-links.Thisrendersexistingmultiplemembershipmodelingapproaches infeasibleforlargenetworks. Furthermore,determiningthemultiplemembershipassignmentsisa combinatorialchallenge as the number of potential states grow as 2KN rather than KN in single membershipmodels. In particular, standard Gibbssampling approachestend to get stuck in local suboptimalconfigurationswheresingleassignmentchangesarenotadequateforthe identification ofprobablealternativeconfigurations[11]. Consequently,thereisbothaneedforcomputationally efficientmodelsthatscalelinearlyinthenumberoflinksaswellasreliableinferenceschemesfor modelingmultiplememberships. In this paper, we proposea new non-parametricBayesian latentfeature graphmodel, denotedthe infinitemultiplerelationalmodel(IMRM), thataddressesthechallengesmentionedabove. Specifi- cally,thecontributionsinthispaperarethefollowing: i)WeproposetheIMRMinwhichinference scaleslinearlyinthenumberoflinks. ii)Weproposeanon-conjugatesplit-mergesamplingproce- dureforparameterinference.iii)WedemonstratehowthesinglemembershipIRMmodelimplicitly accounts for multiple memberships. iv) We compare existing non-parametric single membership modelswithourproposedmultiplemembershipcounterpartsinlearninglatentstructureofavariety ofbenchmark”real”sizenetworksanddemonstratethatexplicitlymodelingmultiple-membership resultsinmorecompactrepresentationsoflatentstructure. 2 Infinite multiple-membership relational model Given a graph, assume that each vertex v has an associated K-dimensionalbinary latent feature i vector,z ,withK = |z | assignments. Considervertexv andv : ForallK K combinationsof i i i 1 i j i j classes there is an associated probability, ρ , of generatinga link. We assume thateach of these kℓ combinationsofclassesactindependentlytogeneratealinkbetweenv andv , suchthatthetotal i j probability,π ,ofgeneratingalinkbetweenv andv isgivenby ij i j πij =1−(1−σij)Y(1−ρkℓ)zikzjℓ, (2) kℓ where σ is an optional term that can be used to account for noise or to include further side- ij information as discussed previously. Under this model, the features act as independentcauses of links,andthusifavertexgetsanadditionalfeatureitwillresultinanincreasedprobabilityoflink- ing to other vertices. In contrast to the model proposed by [12], where negative weights leads to features that inhibit links, our model is more restricted. Although this might result in less power toexplaindata,weexpectthatitwillbeeasiertointerpretthefeaturesinourmodelbecauselinks aredirectlygeneratedbyindividualfeaturesandnotthroughcomplexinteractionsbetweenfeatures. Thisisanalogoustonon-negativematrixfactorizationthatisknowntoformparts-basedrepresen- tationbecauseitdoesnotallowcomponentcancellations[10]. Ifthelatentfeaturesz haveonlya i singleactiveelementandσ = 0,Eq.(2)reducestoπ = ρ ,i.e.,theproposedmodeldirectly ij ij cicj generalizestheIRMmodel;hence,wedenoteourmodeltheinfinitemultiple-membershiprelational model(IMRM). The link probability model in Eq. (2) has a very attractive computational property. In many real data sets, the number of non-links far exceeds the number of links present in the network. To analyzelargescalenetworkswherethisholdsitisagreatadvantagetodevisealgorithmsthatscale computationallyonly with the number of links present. As we show in the following, our model has that property. Assuming σ = 0 for simplicity of presentation, we may write Eq. (2) more ij compactlyas πij =1−ez⊤iPzj, 3 wheretheelementsofthematrixP arep =log(1−ρ ). InsertingthisinEq.(1)wehave kℓ kl p(Y|Z,P)=Y(cid:16)1−ez⊤iPzj(cid:17)yij(cid:16)ez⊤iPzj(cid:17)1−yij =hY(1−ez⊤iPzj)iexphXz⊤iPzji. (3) (i,j)∈Y (i,j)∈Y1 (i,j)∈Y0 Theexponentofthesecondterm,whichentailsasumoverthepossiblylargesetofnon-linksinthe network,canbeefficientlycomputedas N N p z z − z z , (4) X kℓ(cid:16)X ikX jℓ X ik jℓ(cid:17) kℓ i=1 j=1 (i,j)∈Y1∪Y? requiringonlysummationoverlinksand“missing”links.Assumingthatthegraphisnotdominated by “missing” links, the computation of Eq. (3) scales linearly in the numberof graph links, |Y |. 1 We presentlyconsiderlatentbinaryfeaturesz , butwe notethatthe modelscales linearlyforany i parameterizationsofthelatentfeaturevectorzi,aslongasπij = 1−ez⊤iPzj ∈ [0;1]whichholds ingeneralifz isnon-negative. i As in existingmultiple membershipmodels[11, 12] we will assume an unboundednumberofla- tent features. We learn the effective number of features through the Indian buffet process (IBP) representation[6],whichdefinesadistributionoverunboundedbinarymatrices, αK K (N −m )!(m −1)! Z ∼IBP(α)∝ k k (5) Kh!Y N! h∈[Q0,1]N k=1 where mk is the numberof vertices belongingto class k and Kh is the numberof columnsof Z equaltoh. AsapriorovertheclasslinkprobabilitieswechooseindependentBetadistributions, ρkℓ|akℓ,bkℓ∼Beta(akℓ,bkℓ)∝ρkaℓkℓ−1(1−ρkℓ)bkℓ−1. (6) Thisisaconjugatepriorforthesinglemembershipmodelswheretheparametersa andb corre- kℓ kℓ spondtopseudocountsoflinksandnon-linksrespectivelybetweenclasseskandℓ. 2.1 Inference Inthefollowingwepresentamethodforinferringtheparametersofthemodel: theinfinitebinary featurematrixZandthelinkprobabilitiesρ .Inthelatentclassmodelwhenonlyasinglefeatureis kℓ activeforeachvertex,thelikelihoodinEq.(3)isconjugatetotheBetapriorforρ . Inthatcase,P kℓ canbeintegratedawayandacollapsedGibbssamplingprocedureforZcanbeused[9]. Thisisnot possibleinthe IMRM; instead,weproposetosampleP ∼ p(P|Z,Y)usingHamiltonianMarkov chainMonteCarlo(HMC),andZ ∼ p(Z|P,Y)usingGibbssamplingcombinedwithsplit-merge moves. HMCforclasslinkprobabilities: HamiltonianMarkovchainMonteCarlo(HMC)[5]isanauxil- iaryvariablesamplingprocedurethatutilizesthegradientofthelogposteriortoavoidtherandom walkbehaviorofothersamplingmethodssuchasMetropolis-Hastings. Inthefollowingwedonot describe the details of the HMC algorithm, but only derive the required expressionsfor the gradi- ent. ToutilizeHMC,thesampledvariablesmustbeunconstrained,butsinceρkℓ isaprobabilitywe make the followingchangeof variablefromρ ∈ [0,1] to r ∈ (−∞,∞), ρ = 1 , kℓ kℓ kℓ 1+exp(−rkℓ) r =−log ρ−1−1 . Usingthechangeofvariablestheorem,thepriorfortheclasslinkprobabil- kℓ (cid:0) kℓ (cid:1) itiesexpressedintermsofrkℓ isgivenbyp(rkℓ|akℓ,bkℓ)∝eakℓrkℓ(erkℓ +1)−(akℓ+bkℓ). Withthis, therelevanttermsofthenegativelogposteriorisgivenby −LP =logp(P|Z,Y)=c+Xlog(cid:16)1−ez⊤iPzj(cid:17)+Xz⊤iPzj+Xakℓrkℓ+(akℓ+bkℓ)log(erkℓ+1), (i,j)∈Y1 (i,j)∈Y0 kℓ (7) wherecdoesnotdependonP. Fromthis,therequiredgradientcanbecomputed, z⊤Pz ∂LP e i j =− z z ρ + z z ρ +(a +b )ρ −a . (8) ∂rkℓ X 1−ez⊤iPzj ik jℓ kℓ X ik jℓ kℓ kℓ kℓ kℓ kℓ (i,j)∈Y1 (i,j)∈Y0 4 Again,thepossiblylargesum overnon-linksin thesecondtermcanbe computedefficientlyasin Eq.(4). Gibbssamplerforbinaryfeatures: Following[6],aGibbssamplerforthelatentbinaryfeatures Z canbederived.Considersamplingthekthfeatureofvertexv : Ifoneormoreotherverticesalso i possessthefeature,i.e.,m = z >0,theposteriormarginalisgivenby −ik Pj6=i jk m p(z =1|Z ,P,Y)∝p(Y|Z,P) −ik. (9) ik −(ik) N Whenevaluatingthelikelihoodterm,onlythetermsthatdependonz needbecomputedandthe ik Gibbssamplercanbeimplementedefficientlybyreusingcomputationandbyupanddowndating variables. Inadditiontosamplingexistingfeatures,K(i) =Poisson α newfeaturesshouldalsobeassociated 1 (cid:0)N(cid:1) withv . [6]suggest“...computingprobabilitiesforarangeofvaluesofK(i)uptosomereasonable i 1 upperbound...”; however,following[11] we takeanotherapproachandsamplethe newfeatures byMetropolis-Hastingsusingthepriorasproposaldensity. Thevaluesofρ correspondingtothe kl newfeaturesareproposedfromthepriorinEq.(6). Split-merge move for binary features: A drawback of Gibbs sampling proceduresis that only a single variable is updated at a time, which makes the sampler prone to get stuck in suboptimal configurations. Asa remedy,bolderMetropolisHasting movescanbeconsideredinwhichmulti- ple changesof assignments help exploringalternative high probabilityconfigurations. How these alternativeconfigurationsare proposedis crucialin orderto attain reasonableacceptancerates. A popular approachis to split or merge existing classes as proposedin [8] for the Dirichlet process mixturemodel(DPMM). Split-mergesamplinginthe IBP haspreviouslybeendiscussedbrieflyby [11]and[12]. Inspiredbythenon-conjugatesequentialallocationsplit-mergesamplerfortheDPMM [4],wepro- posethefollowingprocedure: Drawtwonon-zeroelementsofZ,(k ,i )and(k ,i ). Ifk = k 1 1 2 2 1 2 proposeasplit—otherwiseproposetomergeclassesk andk intoajointclusterk . Acceptthe 1 2 1 proposalwiththeMetropolis-Hastingsacceptancerate,a∗ = min 1,p(Z∗,P∗|Y)q(Z|Z∗)q(P|P∗) . (cid:16) p(Z,P|Y)q(Z∗|Z)q(P∗|P) (cid:17) Incaseofamerge,weremovek andassignallitsverticestok , andweremovethecorrespond- 2 1 ing row and columnof P (this proposalis deterministic and hasprobabilityone). For a split, we removeall verticesexcepti fromcluster k = k = k andcreatea newcluster k∗ andassigni 1 1 2 2 toit. Wethensampleanewrowandcolumnρ∗ forthenewclusterasdescribedbelow. Nextwe k′ℓ′ sequentiallyallocate[4]theremainingoriginalmembersofktoeitherkork∗orbothinarestricted Gibbssamplingsweep,andrefinetheallocationthroughtadditionalrestrictedGibbsscans[8]. Theproposaldensityforρ∗k′ℓ′ isbasedonarandomwalk,ρ∗k′ℓ′ ∼Beta(a¯k′ℓ′,¯bk′ℓ′),where ¯bk′ℓ′ =max(cid:0)1,(1−ρ¯k′ℓ′)m2k−1+ρ¯k′ℓ′(cid:1), a¯k′ℓ′ =max(cid:16)1,1−ρ¯kρ′¯ℓk′′ℓ′¯bk′ℓ′(cid:17), (10) suchthatρ∗k′ℓ′ hasmeanρ¯k′ℓ′ andvarianceequaltotheempiricalvariance,ρ¯k′ℓ′(1−ρ¯k′ℓ′)/m2k.We ρ ℓ′ =k′ =k∗ kk  1 ρ k′ =k∗,ℓ′ =k choosethemeanoftherandomwalkasρ¯k′ℓ′ = K1−1Pℓ6=kρkℓ k′ =k,ℓ′ =k∗ suchthatthe K−1Pℓ6=k ℓk ρ otherwise, newclasshasasimilarwithinandbetweenclasslinkkpℓrobabilitiesastheoriginalclass,butsuchthat theclasslinkprobabilitybetweentheoriginalandnewclusterissimilartotheremainingbetween classlinkprobabilities. Thischoiceiscrucial,sinceitfavorssplittingclassesintotwoclassesthat arenomorerelatedthantherelationtotheremainingclasses. 3 Results BasedontheHW,DB,andRMparametrizationofρ,wecomparedourproposedIMRMtothecorre- spondingsingle-membershipIRM [9]. We evaluatedthemodelsonarangeofsyntheticallygener- atedaswellasrealworldnetworks. We assessedmodelperformanceintermsofabilitytopredict held-outlinksandnon-links. Asperformancemeasureweusedtheareaundercurve(AUC) ofthe 5 Y Z ρ Y Z ρ Figure 2: IRM (upper) and IMRM (lower) analysis of single (left) and multiple membership HW network(right).Onthesinglemembershipdata,bothmodelsfindthecorrectclassassignments.On themultiplemembershipdata,theIMRMfindsthecorrect10classes,whileIRMextracts25classes, whichthroughρaccountsforallcombinationsofclassespresentinthedata. HW DB RM MHW MDB MRM 1 0.8 0.6 0.4 CNOMJACDPEGSPHIHWIDBIRMIMHWIMDBIMRMCNOMJACDPEGSPHIHWIDBIRMIMHWIMDBIMRMCNOMJACDPEGSPHIHWIDBIRMIMHWIMDBIMRMCNOMJACDPEGSPHIHWIDBIRMIMHWIMDBIMRMCNOMJACDPEGSPHIHWIDBIRMIMHWIMDBIMRMCNOMJACDPEGSPHIHWIDBIRMIMHWIMDBIMRM Figure3: AUCscoresfortheanalysisofthesixsyntheticallygenerateddatasets. receiveroperatingcharacteristic(ROC). Wealsocomputedthepredictiveloglikelihood(notshown here)whichgavesimilarresults. Forcomparison,weincludedtheperformanceofseveralstandard non-parametriclinkpredictionapproachesbasedonthefollowingscores, y⊤y 1 γComN =y⊤y , γDegPr =k k , γJacc = i j , γShP = , i,j i j i,j i j i,j k +k −y⊤y i,j min {(Yp) >0} i j i j p i,j wherek = y isthedegreeofvertexv . i j ij i P Inalltheanalysesweremoved2.5%ofthelinksandanequivalentnumberofnon-linksforcross- validation. We analyzed a total of five random data splits and all of the analyses were based on 2500samplingiterationsinitialized randomlywith K = 50 clusters. Eachiterationwas basedon split-merge sampling using sequential allocation with t = 2 restricted Gibbs scans followed by standardGibbssampling. Ourimplementationofthe IRM wasbasedoncollapsedGibbssampling (i.e. integrating out ρ) as proposed in [9] but we also included a conjugate single-membership split-mergestepcorrespondingtotheproposednon-conjugatesplit-mergesampler. Thepriorswere chosenasα = log(N),a = 5,a = 1∀k 6= ℓandb = 1,b = 5∀k 6= ℓwhichrendersthe kk kℓ kk kℓ priorspracticallynon-informative. Syntheticnetworks: WeanalyzedatotalofsixsyntheticnetworksgeneratedaccordingtotheHW, DB and RM modelsbasedonthe verticeshavingeither oneortwo membershipsto theunderlying classes. ForthesinglemembershipmodelswegeneratedatotalofK = 5groupseachcontaining 100vertices. FortheHW generatednetworkwesetρc = 1andρ0 = 0whileforthe DBgenerated network we used a within communitydensities ρk rangingfrom 0.2 to 1 while ρ0 = 0. The RM generated network had same within community densities as the DB network but includedvarying degreesofoverlapbetweenthecommunities.ThemultiplemembershipmodelsdenotedMHW,MDB andMRMweregeneratedfromthecorrespondingsinglemembershipmodelsasY ∨RYR⊤(where ∨denoteselement-wiseorandRisarandompermutationmatrixwithdiagonalzero),suchthateach vertexbelongstotwoclasses. Whilethe IMRM modelexplicitlyaccountsformultiplememberships,the IRM modelcanalsoim- plicitlyaccountformultiplemembershipsthroughthebetweenclassinteractions. Toillustratethis, we analyzed the generated HW and MHW data by the IRM model as well as the proposed IMRM model (see Figure 2). When there are only single memberships, the IMRM reduces to the IRM model; however,whenthenetworkis generatedsuchthatthe verticeshavemultiplememberships the IMRM model correctly identifies the (2·5 = 10) underlying classes. The IRM model on the otherhandextractsalargernumberofclassescorrespondingtoallpossible(52 =25)combinations ofclassespresentinthedata. Theestimatedρindicateshowthese25classescombinetoformthe 6 Yeast USPower Erdos FreeAssoc Reuters911 1 0.8 0.6 0.4 CNOMJACDPEGSPHIHWIDBIRMIMHWIMDBIMRM CNOMJACDPEGSPHIHWIDBIRMIMHWIMDBIMRM CNOMJACDPEGSPHIHWIDBIRMIMHWIMDBIMRM CNOMJACDPEGSPHIHWIDBIRMIMHWIMDBIMRM CNOMJACDPEGSPHIHWIDBIRMIMHWIMDBIMRM Figure4: AUCscoresfortheanalysisofthefiverealnetworks. Table1:Summaryoftheanalyzedrealnetworks:rdenotesthenetworksassortativity,cthecluster- ingcoefficient[19],Ltheaverageshortestpath. NETWORK N |Y1| r c L Description Yeast 2,284 6,646 -0.10 0.13 4.4 Protein-proteininteractionnetwork[18] USPower 4,941 6,594 0.00 0.08 19.9 Topologyofpowergrid[19] Erdos 5,534 8,472 -0.04 0.08 3.9 Erdo¨s02collaborationnetwork[2] FreeAssoc 10,299 61,677 -0.07 0.12 3.9 Wordrelationsinfreeassociation[13] Reuters911 13,314 148,038 -0.11 0.37 3.1 Wordco-occurence[3] 10underlyingmultiplemembershipgroupsin the network. As such, the IRM modelhasthesame expressivepower as the proposedmultiple membershipmodelsbut interpretingthe results can be difficultwhenmultiplemembershipcommunitystructureissplitintoseveralclasseswithcomplex patternsofinteraction. Figure 3 shows the link-prediction AUC scores from the analysis of the six generated networks. Resultsshowthatallmodelsworkwellondatageneratedaccordingtotheirownmodelormodels whichtheygeneralize. Wealsonotice,thatthe IRM modelaccountswellformultiplemembership structureasdiscussedandillustratedinFigure2. TheHW andDB modelsontheotherhandfailin modelingnetworkswithmultiplememberships. Real Networks: We finally analyzedfive benchmarkcomplexnetworkssummarizedin Table 1. Thesizesofmostofthenetworksmakesitcomputationallyinfeasibleforustoanalyzethemusing theexistingmultiple-membershipapproachesproposedin[1,11,12]. Forallthenetworks,multiple membershipsare conceivable: In protein interaction networks such as the Yeast network proteins can be part of multiple functional groups, in social networks such as Erdos scientist collaborate withdifferentgroupsofpeopledependingontheresearchtopic,andinwordrelationnetworkssuch as Reuters911and FreeAssoc wordscan have multiple meanings/contexts. For all these networks explicitly modelingthese multiple contextscan potentially improveon the structure identification overtheequivalentsinglemembershipmodels. In Figure4 the AUC link predictionscoreis givenforthe five networksanalyzed. As can beseen fromthe results, modelingmultiplemembershipssignificantlyimproveson predictinglinksin the network. Inparticularwhenconsideringthe IHW and IDB modelsandthecorrespondingproposed multiple membership models, the learning of structure is improved substantially for all networks exceptUSPower. Furthermore,it can be seen thatthe IRM modelthat can also implicitly account formultiplemembershipsingeneralhasasimilarperformancetothemultiplemembershipmodels. ThepooridentificationofstructureintheUSPowernetworkmightbeduetothefactthattheaverage Table 2: Top table: The number of extracted components for the IRM and IMRM models. Bold denotes that the number of components are significantly different between the two models (i.e. differenceinmeanisatleasttwostandarddeviationsapart).Bottomtable:cpu-timeusageinhours for2500IRMandIMRMsamplingiterations. Yeast USPower Erdos FreeAssoc Reuters911 IRM 24.0±0.8 8.6±0.4 10.4±0.3 58.6±0.7 39.8±2.1 IMRM 15.4±0.9 6.8±0.5 6.8±0.6 15.6±0.9 44.8±1.0 Yeast USPower Erdos FreeAssoc Reuters911 IRM 2.3±0.1 4.0±0.2 14.6±5.9 30.1±0.6 32.5±5.4 IMRM 1.7±0.1 8.9±0.8 7.1±0.5 28.1±1.9 71.5±3.2 7 pathbetweenverticesareveryhighrenderingitdifficulttodetectthe underlyingstructureforany but the most simple IHW model. While the IRM and IMRM performequally well in terms of link predictionitcanbeseenintable2thattheaveragenumberofextractedcomponentsforthe IMRM modelis significantly smaller than the numberof componentsextracted by the IRM modelfor all networksexcepttheReuters911networkwherenosignificantdifferenceisfound. Asa result, the IMRM modelis in generalable to extracta more compactrepresentationof the latent structure of networks. In table 2 is given the total cpu-time for estimating the 2500 samples for each of the networkusingtheIRMandIMRMshowingthattheorderofmagnitudeforthecomputationalcostof thetwomodelsarethesame. 4 Discussion WhilesinglemembershipmodelsbasedontheIRMindirectlycanaccountformultiplememberships aswehaveshown,thebenefitoftheproposedframeworkisthatitallowsforthesemultiplemem- bershipstobemodeledexplicitlyratherthanthroughcomplexbetween-groupinteractionsbasedon a multitude of single membership components. On synthetic and real data we demonstrated that explicitlymodelingmultiple-membershipresultedinamorecompactrepresentationoftheinherent structureinnetworks. Wefurtherdemonstratedthatmodelsthatcancapturemultiplememberships (whichincludestheIRMmodel)significantlyimproveonthelinkpredictionrelativetomodelsthat canonlyaccountforsinglemembershipstructure,i.e.,theIHWandIDBmodels. Wepresentlycon- sideredundirectednetworksbutwenotethattheproposedapproachreadilygeneralizestodirected and bi-partite graphs. Furthermore, the approachalso extendsto include side informationas pro- posedin[12]aswellassimultaneousmodelingofvertexattributes[20]. Wenotehowever,thatthe inclusion of side informationrequires a linearly scalable parameterizationin order for the overall modeltoremaincomputationallyefficient. AnattractivepropertyoftheIRMmodelovertheIMRM modelisthattheIRMmodeladmitstheuseofcollapsedGibbssamplingwhichwehavefoundtobe moreefficientrelativetosamplingthenon-conjugatemultiplemembershipmodelswhereadditional sampling of the ρ parameter is required. In future research, we envision combining the IRM and IMRMmodel,usingtheIRMasinitializationfortheIMRMorbyforminghybridmodels. References [1] E.M.Airoldi,D.M.Blei,S.E.Fienberg,andE.P.Xing.Mixedmembershipstochasticblockmodels.J.Mach.Learn.Res.,9:1981–2014, 2008. [2] V.BatageljandA.Mrvar.Pajekdatasets,2006. [3] S.R.Corman,T.Kuhn,R.D.McPhee,andD.K.J.Studyingcomplexdiscursivesystems:Centeringresonanceanalysisofcommunica- tion.Humancommunicationresearch,28(2):157–206,2002. [4] D.B.Dahl. Sequentially-allocatedmerge-splitsamplerforconjugateandnonconjugateDirichletprocessmixturemodels. Technical report,TexasA&MUniversity,2005. [5] S.Duane,A.D.Kennedy,B.J.Pendleton,andD.Roweth.Hybridmontecarlo.PhysicsLettersB,195(2):216–222,1987. [6] T.L.GriffithsandZ.Ghahramani.InfinitelatentfeaturemodelsandtheIndianbuffetprocess.InNeuralInformationProcessingSystems, Advancesin(NIPS),pages475–482,2006. [7] J.M.HofmanandC.H.Wiggins.Bayesianapproachtonetworkmodularity.PhysicalReviewLetters,100(25),Jun2008. [8] S.JainandR.M.Neal.Asplit-mergemarkovchainmontecarloprocedureforthedirichletprocessmixturemodel.JournalofCompu- tationalandGraphicalStatistics,13(1):158–182,2004. [9] C.Kemp,J.B.Tenenbaum,T.L.Griffiths,T.Yamada,andN.Ueda.Learningsystemsofconceptswithaninfiniterelationalmodel.In ArtificialIntelligence,ProceedingsoftheNationalAAAIConferenceon,2006. [10] D.D.LeeandH.S.Seung.Learningthepartsofobjectsbynon-negativematrixfactorization.Nature,401(6755):788–791,1999. [11] E.Meeds,Z.Ghahramani,R.M.Neal,andS.T.Roweis. Modelingdyadicdatawithbinarylatentfactors. InAdvancesinNeural InformationProcessingSystems(NIPS),volume19,pages977–984,2007. [12] K.T.Miller,T.L.Griffiths,andM.I.Jordan.Nonparametriclatentfeaturemodelsforlinkprediction.InAdvancesinNeuralInformation ProcessingSystems(NIPS),pages1276–1284,2009. [13] D.L.Nelson,C.L.McEvoy,andT.A.Schreiber. Theuniversityofsouthfloridawordassociation,rhyme,andwordfragmentnorms, 1998. [14] M.E.J.Newman.Thestructureofscientificcollaborationnetworks.PNAS,98(2):404–409,2001. [15] M.E.J.NewmanandM.Girvan.Findingandevaluatingcommunitystructureinnetworks.Phys.Rev.E,69(2):026113–1–15.,2004. [16] K.NowickiandT.A.B.Snijders.Estimationandpredictionforstochasticblockstructures.JournaloftheAmericanStatisticalAssocia- tion,96(455):1077–1087,2001. [17] J.ReichardtandS.Bornholdt.Statisticalmechanicsofcommunitydetection.PhysicalReviewE,74(1),2006. [18] S.Sun,L.Ling,N.Zhang,G.Li,andR.Chen.Topologicalstructureanalysisoftheprotein-proteininteractionnetworkinbuddingyeast. NucleicAcidsResearch,31(9):2443–2450,2003. [19] D.J.WattsandS.H.Strogatz.Collectivedynamicsof’small-world’networks.Nature,393(6684):440–442,June1998. [20] Z.Xu,V.Tresp,KYu,andH.-P.Kriegel. Learninginfinitehiddenrelationalmodels.UncertainityinArtificialIntelligence(UAI2006), 2006. 8

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.