Heterogeneous Information Network Embedding for Meta Path based Proximity

Zhipeng Huang, Nikos Mamoulis
The University of Hong Kong
{zphuang, nikos}@cs.hku.hk

ABSTRACT

A network embedding is a representation of a large graph in a low-dimensional space, where vertices are modeled as vectors. The objective of a good embedding is to preserve the proximity (i.e., similarity) between vertices in the original graph. This way, typical search and mining methods (e.g., similarity search, kNN retrieval, classification, clustering) can be applied in the embedded space with the help of off-the-shelf multidimensional indexing approaches. Existing network embedding techniques focus on homogeneous networks, where all vertices are considered to belong to a single class. Therefore, they are weak in supporting similarity measures for heterogeneous networks. In this paper, we present an effective heterogeneous network embedding approach for meta path based proximity measures. We define an objective function which aims at minimizing the distance between two distributions, one modeling the meta path based proximities, the other modeling the proximities in the embedded vector space. We also investigate the use of negative sampling to accelerate the optimization process. As shown in our extensive experimental evaluation, our method creates embeddings of high quality and has superior performance in several data mining tasks compared to state-of-the-art network embedding methods.

Keywords

heterogeneous information network; meta path; network embedding

[Figure 1: An example of HIN. Object types: Author, Paper, Topic, Venue; edge types: Write, Publish, Mention, Cite.]

1. INTRODUCTION

The availability and growth of large networks, such as social networks, co-author networks, and knowledge base graphs, has given rise to numerous applications that search and analyze information in them. However, for very large graphs, common information retrieval and mining tasks such as link prediction, node classification, clustering, and recommendation are time-consuming. This motivated a lot of interest [6,22,31] in approaches that embed the network into a low-dimensional space, such that the original vertices of the graph are represented as vectors. A good embedding preserves the proximity (i.e., similarity) between vertices in the original graph. Search and analysis can then be applied on the embedding with the help of efficient algorithms and indexing approaches for vector spaces.

Heterogeneous information networks (HINs), such as DBLP [15], YAGO [26], DBpedia [2] and Freebase [4], are networks whose nodes and edges may belong to multiple types. These graph data sources contain a vast number of interrelated facts, and they can facilitate the discovery of interesting knowledge [11,14,17,20]. Figure 1 illustrates a HIN, which describes the relationships between objects (graph vertices) of different types (e.g., author, paper, venue and topic). For example, Jiawei Han (a1) has written a WWW paper (p1), which mentions topic "Embed" (t1). Compared to homogeneous networks, the relationships between objects in a HIN are much more complex. The proximity among objects in a HIN is not just a measure of closeness or distance; it is also based on semantics. For example, in the HIN of Figure 1, author a1 is close to both a2 and v1, but these relationships have different semantics: a2 is a co-author of a1, while v1 is a venue where a1 has a paper published.

Meta path [29] is a recently proposed proximity model in HINs. A meta path is a sequence of object types with edge types in between, modeling a particular relationship. For example, A → P → V → P → A is a meta path which states that two authors (A) are related by their publications in the same venue (V). Based on meta paths, several proximity measures have been proposed. For example, PathCount [29] counts the number of meta path instances connecting two objects, while Path Constrained Random Walk (PCRW) [14] measures the probability that a random walk starting from one object would reach the other via a meta path instance. These measures have been shown to outperform proximity measures that are not based on meta paths in various important tasks, such as k-NN search [29], link prediction [27,28,35], recommendation [34], classification [12,13] and clustering [30].

Although there are a few works on embedding HINs [7,21], none of them is designed for meta path based proximity in general HINs. To fill this gap, in this paper we propose HINE, which learns a transformation of the objects (i.e., vertices) in a HIN to a low-dimensional space, such that the meta path based proximities between objects are preserved. More specifically, we define an appropriate objective function that preserves the relative, proximity-based rankings of the vertices in the original and the embedded space. As shown in [29], meta paths with too large lengths are not very informative; therefore, we only consider meta paths up to a given length threshold l. We also investigate the use of negative sampling [19] in order to accelerate the optimization process.

We conduct extensive experiments on four real HIN datasets to compare our proposed HINE method with state-of-the-art network embedding methods (i.e., LINE [31] and DeepWalk [22]), which do not consider meta path based proximity. Our experimental results show that our HINE method with PCRW as the meta path based proximity measure outperforms all alternative approaches in most of the qualitative measures used.
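Before the formal treatment in Section 3, the two measures can be made concrete with a few lines of Python. This is only an illustrative sketch: the authorship edges are assumed from Figure 1 (a1 wrote p1 and p3, a2 wrote p2 and p3), and the helper names are our own, not part of the paper's method.

```python
# Toy authorship edges read off Figure 1 (assumed: a1 wrote p1 and p3; a2 wrote p2 and p3).
writes = {"a1": ["p1", "p3"], "a2": ["p2", "p3"]}

def authors_of(paper):
    # inverse of the write edge type
    return [a for a, papers in writes.items() if paper in papers]

def apa_path_count(src, dst):
    # PathCount for A -write-> P -write^{-1}-> A: one unit per shared paper
    return sum(1 for p in writes[src] if p in writes[dst])

def apa_pcrw(src, dst):
    # PCRW: step uniformly to one of src's papers, then uniformly to one of its authors
    score = 0.0
    for p in writes[src]:
        authors = authors_of(p)
        if dst in authors:
            score += (1.0 / len(writes[src])) * (1.0 / len(authors))
    return score
```

For the co-author meta path A → P → A this toy input yields PathCount 1 and PCRW 0.25 for the pair (a1, a2), consistent with the length-2 row of Table 1.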
The contributions of our paper are summarized as follows:

• This is the first work that studies the embedding of HINs for the preservation of meta path based proximities. This is an important subject, because meta path based proximity has been shown to be more effective than traditional structure-based proximities.

• We define an objective function which explicitly aims at minimizing the distance between two probability distributions, one modeling the meta path based proximities between the objects, the other modeling the proximities in the low-dimensional space.

• We investigate the use of negative sampling to accelerate the process of model optimization.

• We conduct extensive experiments on four real HINs, which demonstrate that our HINE method is the most effective approach for preserving the information of the original networks; HINE outperforms the state-of-the-art embedding methods in many data mining tasks, e.g., classification, clustering and visualization.

The rest of this paper is organized as follows. Section 2 discusses related work. In Section 3, we formally define the problem. Section 4 describes our HINE method. In Section 5, we report our experimental results. Section 6 concludes the paper.

2. RELATED WORK

2.1 Heterogeneous Information Networks

The heterogeneity of nodes and edges in HINs brings challenges, but also opportunities, to support important applications. Lately, there has been an increasing interest in both academia and industry in the effective search and analysis of information from HINs. The problem of classifying objects in a HIN by authority propagation is studied in [12]. Follow-up work [13] investigates a collective classification problem in HINs using meta path based dependencies. PathSelClus [30] is a link based clustering algorithm for HINs, in which a user can specify her clustering preference by providing some examples as seeds. The problem of link prediction on HINs has been extensively studied [27,28,35], due to its important applications (e.g., in recommender systems). A related problem is entity recommendation in HINs [34], which takes advantage of the different types of relationships in HINs to provide better recommendations.

2.2 Meta Path and Proximity Measures

Meta path [29] is a general model for the proximity between objects in a HIN. Several measures have been proposed for the proximity between objects w.r.t. a given meta path P. PathCount measures the number of meta path instances connecting the two objects, and PathSim is a normalized version of it [29]. Path constrained random walk (PCRW) was first proposed in [14] for the task of relationship retrieval over bibliographic networks. Later, [17] proposed an automatic approach to learn the best combination of meta paths and their corresponding weights based on PCRW. Finally, HeteSim [25] was recently proposed as an extension of meta path based SimRank. In this paper, we focus on the two most popular proximity measures, i.e., PathCount and PCRW.

2.3 Network Embedding

Network embedding aims at learning low-dimensional representations for the vertices of a network, such that the proximities among them in the original space are preserved in the low-dimensional space. Traditional dimensionality reduction techniques [3,8,24,32] typically construct the affinity graph using the feature vectors of the vertices and then compute the eigenvectors of the affinity graph. Graph factorization [1] finds a low-dimensional representation of a graph through matrix factorization, after representing the graph as an adjacency matrix. However, since these general techniques are not designed for networks, they do not necessarily preserve the global network structure, as pointed out in [31].

Recently, DeepWalk [22] was proposed as a method for learning the latent representations of the nodes of a social network from truncated random walks in the network. DeepWalk combines random walk proximity with the SkipGram model [18], a language model that maximizes the co-occurrence probability among the words that appear within a window in a sentence. However, DeepWalk has certain weaknesses when applied to our problem setting. First, the random walk proximity it adopts does not consider the heterogeneity of a HIN. In this paper, we use PCRW, which extends random walk based proximity to HINs. Second, as pointed out in [31], DeepWalk can only preserve second-order proximity, leading to poor performance in tasks such as link recovery and classification, which require first-order proximity to be well preserved.

LINE [31] is a recently proposed embedding approach for large-scale networks. Although it uses an explicit objective function to preserve the network structure, its performance suffers from the way it learns the vector representations. By design, LINE learns two representations separately, one preserving first-order proximity and the other preserving second-order proximity; then, it directly concatenates the two vectors to form the final representation. In this paper, we propose a network embedding method which does not distinguish the learning of first- and second-order proximities and embeds proximities of all orders simultaneously.

GraRep [6] further extends DeepWalk to utilize high-order proximities. GraRep does not scale well to large networks, due to the expensive computation of the power of a matrix and the involvement of SVD in the learning process. SDNE [33] is a semi-supervised deep model that captures the non-linear structural information of the network. The source code of SDNE is not available, so this approach cannot be reproduced and compared to ours. Similarly, [5] embeds entities in knowledge bases using an innovative neural network architecture, and TriDNR [21] extends this embedding model to consider features from three aspects of the network: 1) network structure, 2) node content, and 3) label information. Our current work focuses on meta path based proximities, so it does not consider any other information in the HIN besides the network structure and the types of nodes and edges.

3. PROBLEM DEFINITION

In this section, we formulate the problem of heterogeneous information network (HIN) embedding for meta path based proximities. We first define a HIN as follows:

DEFINITION 1. (Heterogeneous Information Network) A heterogeneous information network (HIN) is a directed graph G = (V, E) with an object type mapping function φ : V → L and a link type mapping function ψ : E → R, where each object v ∈ V belongs to an object type φ(v) ∈ L, and each link e ∈ E belongs to a link type ψ(e) ∈ R.

Figure 1 illustrates a small bibliographic HIN. We can see that the HIN contains two authors, three papers, three venues, two topics, and four types of links connecting them. For ease of discussion, we assume that each object belongs to a single type; however, our technique can easily be adapted to the case where objects belong to multiple types.

DEFINITION 2. (HIN Schema) Given a HIN G = (V, E) with mappings φ : V → L and ψ : E → R, the schema T_G of G is a directed graph defined over the object types L and the link types R, i.e., T_G = (L, R).

The HIN schema expresses all allowable link types between object types. Figure 2(a) shows the schema of the HIN in Figure 1, where nodes A, P, T and V correspond to author, paper, topic and venue, respectively. There are also different edge types in the schema, such as 'publish' and 'write'. Figure 2(b) shows the schema of a movie-related HIN, with M, A, D, P, C representing movie, actor, director, producer and composer, respectively.

[Figure 2: Schemas of two heterogeneous networks: (a) DBLP (edge types: write, publish, mention, cite), (b) MOVIE (edge types: actIn, direct, produce, compose).]

In a HIN G, two objects o1, o2 may be connected via multiple edges or paths. Conceptually, each of these paths represents a specific direct or composite relationship between them. In Figure 1, authors a1 and a2 are connected via multiple paths. For example, a1 → p3 → a2 means that a1 and a2 are co-authors of paper p3, and a1 → p1 → p2 → a2 means that a2's paper p2 cites a1's paper p1. These two paths represent two different relationships between authors a1 and a2.

Each different type of relationship is modeled by a meta path [29]. Specifically, a meta path P is a sequence of object types l1, ..., ln connected by link types r1, ..., r_{n-1}, as follows:

    P = l1 -r1-> l2 ... l_{n-1} -r_{n-1}-> ln

For example, the meta path A -write-> P -write^{-1}-> A represents the co-author relationship between two authors.

An instance of the meta path P is a path in the HIN which conforms to the pattern of P. For example, path a1 → p3 → a2 in Figure 1 is an instance of meta path A -write-> P -write^{-1}-> A.

DEFINITION 3. (Meta Path based Proximity) Given a HIN G = (V, E) and a meta path P, the proximity of two nodes o_s, o_t ∈ V with respect to P is defined as:

    s(o_s, o_t | P) = Σ_{p_{o_s→o_t} ∈ P} s(o_s, o_t | p_{o_s→o_t})    (1)

where p_{o_s→o_t} is a meta path instance of P linking from o_s to o_t, and s(o_s, o_t | p_{o_s→o_t}) is a proximity score w.r.t. the instance p_{o_s→o_t}.¹

There are different ways to define s(o_s, o_t | p_{o_s→o_t}) in the literature. Here, we only list the two definitions that we use as test cases in our experiments.

• According to the PathCount (PC) model, each meta path instance is equally important and should be given equal weight. Hence, s(o_s, o_t | p_{o_s→o_t}) = 1, and the proximity between two objects w.r.t. P equals the number of instances of P connecting them, i.e., s(o_s, o_t | P) = |{p_{o_s→o_t} ∈ P}|.

• Path Constrained Random Walk (PCRW) [14] is a more sophisticated way to define the proximity s(o_s, o_t | p_{o_s→o_t}) based on an instance p_{o_s→o_t}. According to this definition, s(o_s, o_t | p_{o_s→o_t}) is the probability that a random walk restricted on P would follow the instance p_{o_s→o_t}.

Note that the embedding technique we introduce in this paper can also easily be adapted to other meta path based proximity measures, e.g., PathSim [29], HeteSim [25] and BPCRW [17].

DEFINITION 4. (Proximity in HIN) For each pair of objects o_s, o_t ∈ V, the proximity between o_s and o_t in G is defined as:

    s(o_s, o_t) = Σ_P s(o_s, o_t | P)    (2)

where P is some meta path, and s(o_s, o_t | P) is the proximity w.r.t. the meta path P as defined in Definition 3.

According to Definition 4, the proximity of two objects equals the sum of their proximities w.r.t. all meta paths. Intuitively, this can capture all kinds of relationships between the two objects. We now provide a definition for the problem that we study in this paper.

DEFINITION 5. (HIN Embedding for Meta Path based Proximity) Given a HIN G = (V, E), develop a mapping f : V → R^d that transforms each object o ∈ V to a vector in R^d, such that the proximities between any two objects in the original HIN are preserved in R^d.

¹For ease of presentation, we overload notation P to also denote the set of instances of meta path P.

4. HINE

In this section, we introduce our methodology for embedding HINs. We first discuss how to calculate a truncated estimation of the meta path based proximity in Section 4.1. Then, we introduce our model and define the objective function we want to optimize in Section 4.2. Finally, we present a negative sampling approach that accelerates the optimization of the objective function in Section 4.3.

4.1 Truncated Proximity Calculation

According to Definition 4, in order to compute the proximity of two objects o_i, o_j ∈ V, we need to accumulate the corresponding meta path based proximity w.r.t. each meta path P. For example, Table 1 lists some meta paths that have instances connecting a1 and a2 in Figure 1. We can see that there is one length-2 meta path, one length-3 meta path and two length-4 meta paths with instances connecting a1 and a2. Generally speaking, the number of possible meta paths grows exponentially with their length and can be infinite for certain HIN schemas (in this case, the computation of s(o_i, o_j) is infeasible).

Table 1: Meta paths and instances connecting a1 and a2

    Length | Meta path | Instance | PC | PCRW
    2 | APA | a1→p3→a2 | 1 | 0.25
    3 | APPA | a1→p1→p2→a2 | 1 | 0.5
    4 | APTPA | a1→p1→t1→p2→a2 | 1 | 0.25
    4 | APVPA | a1→p3→v3→p3→a2 | 1 | 0.25
    ... | ... | ... | ... | ...

As pointed out in [29], shorter meta paths are more informative than longer ones, because longer meta paths link more remote objects (which are less related semantically). Therefore, we use a truncated estimation of proximity, which only considers meta paths up to a length threshold l. This is also consistent with previous works on network embedding, which aim at preserving low-order proximities (e.g., [22] preserves only second-order proximity, and [31] first-order and second-order proximities). Therefore, we define:

    ŝ_l(o_i, o_j) = Σ_{len(P)≤l} s(o_i, o_j | P)    (3)

as the truncated proximity between two objects o_i and o_j.

For the case of PCRW based proximity, we have the following property:

PROPERTY 1.

    ŝ_l(o_i, o_j) = Σ_{(o_i,o′)∈E} p_{o_i→o′}^{ψ(o_i,o′)} × ŝ_{l−1}(o′, o_j)

where p_{o_i→o′}^{ψ(o_i,o′)} is the transition probability from o_i to o′ w.r.t. edge type ψ(o_i, o′). If there are n edges from o_i that belong to edge type ψ(o_i, o′), then p_{o_i→o′}^{ψ(o_i,o′)} = 1/n.

PROOF. Assume that p = o_i → o′ → ... → o_j is a meta path instance of length l. According to the definition of PCRW:

    s(o_i, o_j | p[1:l]) = p_{o_i→o′}^{ψ(o_i,o′)} × s(o′, o_j | p[2:l])    (4)

where p[i:j] is the subsequence of path instance p from the i-th to the j-th object, and p[2:l] is a length l−1 path instance. Then, by summing over all the length-l path instances in Equation 4, we get Property 1.

Based on Property 1, we develop a dynamic programming approach (Algorithm 1) to calculate the truncated proximities. Basically, we compute the proximity matrix Ŝ_k for each k from 1 to l. We first initialize the proximity matrix Ŝ_0 (lines 1-3). Then, we use the transition function of Property 1 to update the proximity matrix for each k (lines 4-9). If we use PCRW as the meta path based proximity measure, inc(o_s, o′, Ŝ_{k−1}[o′, o_t]) in line 9 equals p_{o_s→o′}^{ψ(o_s,o′)} × Ŝ_{k−1}[o′, o_t]. The algorithm can also be used for PathCount, in which case we set inc(o_s, o′, Ŝ_{k−1}[o′, o_t]) = Ŝ_{k−1}[o′, o_t].

Algorithm 1: Calculate Truncated Proximity
    Input: HIN G = (V, E), length threshold l
    Output: Truncated proximity matrix Ŝ_l
    1: Ŝ_0 ← ∅
    2: for o_s ∈ V do
    3:     Ŝ_0[o_s, o_s] ← 1.0
    4: for k ∈ [1 ... l] do
    5:     Ŝ_k ← ∅
    6:     for o_s ∈ V do
    7:         for o′ ∈ neighbor(o_s) do
    8:             for (o′, o_t) ∈ Ŝ_{k−1} do
    9:                 Ŝ_k[o_s, o_t] ← Ŝ_k[o_s, o_t] + inc(o_s, o′, Ŝ_{k−1}[o′, o_t])
    10: return Ŝ_l

We now provide a time complexity analysis for computing the truncated proximities on a HIN using Algorithm 1. For each object o_s ∈ V, we need to enumerate all the meta path instances within length l. Suppose the average degree in the HIN is D; then, there are on average lD such instances. Hence, the total time complexity for proximity calculation is O(|V| · lD), which is linear in the size of the HIN.

4.2 Model

We now introduce our HINE embedding method, which preserves the meta path based proximities between objects as described above. For each pair of objects o_i and o_j, we use the sigmoid function to define their joint probability, i.e.,

    p(o_i, o_j) = 1 / (1 + e^{−v_i · v_j})    (5)

where v_i (resp. v_j) ∈ R^d is the low-dimensional representation of object o_i (resp. o_j). p : V² → R is a probability distribution over the pairs of objects in the original HIN.

In the original HIN G, the empirical joint probability of o_i and o_j can be defined as:

    p̂(o_i, o_j) = s(o_i, o_j) / Σ_{o′∈V} s(o_i, o′)    (6)

To preserve the meta path based proximity s(·, ·), a natural objective is to minimize the distance between these two probability distributions:

    O = dist(p̂, p)    (7)

In this paper, we choose KL-divergence as the distance metric, so we have:

    O = − Σ_{o_i,o_j∈V} s(o_i, o_j) log p(o_i, o_j)    (8)

4.3 Negative Sampling

Directly optimizing the objective function in Equation 8 is problematic. First of all, there is a trivial solution: v_{i,d} = ∞ for all i and all d. Second, it is computationally expensive to calculate the gradient, as we need to sum over all the non-zero proximity scores s(o_i, o_j) for a specific object o_i. To address these problems, we adopt negative sampling [19], which basically samples a small number of negative objects to enhance the influence of positive objects.

Formally, we define an objective function for each pair of objects with non-zero meta path based proximity s(o_i, o_j):

    T(o_i, o_j) = −log(1 + e^{−v_i·v_j}) − Σ_{k=1}^{K} E_{v′∼P_n(o_i)}[log(1 + e^{v_i·v′})]    (9)

where K is the number of negative samples, and P_n(v) is some noise distribution. As suggested by [19], we set P_n(v) ∝ d_out(v)^{3/4}, where d_out(v) is the out-degree of v.

We adopt the asynchronous stochastic gradient descent (ASGD) algorithm [23] to optimize the objective function in Equation 9. Specifically, in each round of the optimization, we sample some pairs of objects o_i and o_j with non-zero proximity s(o_i, o_j). Then, the gradient w.r.t. v_i can be calculated as:

    ∂O/∂v_i = −s(o_i, o_j) · (e^{−v_i·v_j} / (1 + e^{−v_i·v_j})) · v_j    (10)

5. EXPERIMENTS

In this section, we conduct extensive experiments in order to test the effectiveness of our proposed HIN embedding approach. We first introduce our datasets and the set of methods to be compared in Section 5.1. Then, we evaluate the effectiveness of all approaches on five important data mining tasks: network recovery (Section 5.2), classification (Section 5.3), clustering (Section 5.4), k-NN search (Section 5.5) and visualization (Section 5.6). In addition, we conduct a case study (Section 5.7) to compare the quality of top-k lists. We evaluate the influence of parameter l in Section 5.8. Finally, we assess the runtime cost of applying our proposed transformation in Section 5.9.

5.1 Dataset and Configurations

Datasets. We use four real datasets in our evaluation. Table 2 shows some statistics about them.

Table 2: Statistics of Datasets

    Dataset | |V| | |E| | Avg. degree | |L| | |R|
    DBLP | 15,649 | 51,377 | 6.57 | 4 | 4
    MOVIE | 25,643 | 40,173 | 3.13 | 5 | 4
    YELP | 37,342 | 178,519 | 9.56 | 4 | 3
    GAME | 6,909 | 7,754 | 2.24 | 4 | 3

• DBLP. The schema of the DBLP network is shown in Figure 2(a). We use a subset of DBLP, i.e., DBLP-4-Area taken from [29], which contains 5,237 papers (P), 5,915 authors (A), 18 venues (V), and 4,479 topics (T). The authors are from 4 areas: database, data mining, machine learning and information retrieval.

• MOVIE. We extracted a subset from YAGO [10], which contains knowledge about movies. The schema of the MOVIE network is shown in Figure 2(b). It contains 7,332 movies (M), 10,789 actors (A), 1,741 directors (D), 3,392 producers (P) and 1,483 composers (C). The movies are divided into five genres: action, horror, adventure, sci-fi and crime.

• YELP. We extracted a HIN from YELP, which is about reviews given to restaurants. It contains 33,360 reviews (V), 1,286 customers (C), 82 food-related keywords (K) and 2,614 restaurants (R). We only extract restaurants of one of the following types: fast food, American, sushi bar. The schema of the HIN is shown in Figure 3(a).

• GAME. We extracted from Freebase [4] a HIN which is related to video games. The HIN consists of 4,095 games (G), 1,578 publishers (P), 2,043 developers (D) and 197 designers (S). All the game objects extracted are of one of the three genres: action, adventure, and strategy. The schema is shown in Figure 3(b).

[Figure 3: Schemas of two HINs: (a) YELP (edge types: issue, towards, mention), (b) GAME (edge types: develop, publish, design).]

Competitors. We compare the following network embedding approaches:

• DeepWalk [22] is a recently proposed social network embedding method (see Section 2.3 for details). In our experimental settings, we ignore the heterogeneity and directly feed the HINs for embedding. We use the default training settings, i.e., window size w = 5 and length of random walk t = 40.

• LINE [31] is a method that preserves first-order and second-order proximities between vertices (see Section 2.3 for details). For each object, it computes two vectors, one for the first-order and one for the second-order proximities separately, and then concatenates them. We use equal representation lengths for the first-order and second-order proximities, and use the default training settings, i.e., number of negative samples K = 5, starting value of the learning rate ρ0 = 0.025, and total number of training samples T = 100M. Same as for DeepWalk, we directly feed the HINs for embedding.

• HINE_PC is our HINE model using PathCount as the meta path based proximity. We implemented the proximity computation process in C++ on a 16GB memory machine with an Intel(R) Core(TM) CPU @ 3.40GHz. By default, we use l = 2; in Section 5.8, we study the influence of parameter l.

• HINE_PCRW is our HINE model using PCRW as the meta path based proximity. All the other configurations are the same as those of HINE_PC.

Unless otherwise stated, the dimensionality d of the embedded vector space equals 10.

5.2 Network Recovery

We first compare the effectiveness of the different network embedding methods at a task of link recovery. For each type of link (i.e., edge) in the HIN schema, we enumerate all pairs of objects (o_s, o_t) that can be connected by such a link and calculate their proximity in the low-dimensional space after embedding o_s to v_s and o_t to v_t. Finally, we use the area under the ROC curve (AUC) to evaluate the performance of each embedding. For example, for edge type write, we enumerate all pairs of authors a_i and papers p_j in DBLP and compute the proximity for each pair. Finally, using the real DBLP network as ground-truth, we compute the AUC value for each embedding method.
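The evaluation protocol above can be sketched in a few lines of Python. The paper does not spell out which score is used to rank candidate pairs in the embedded space, so this illustrative snippet assumes the model's joint probability of Eq. (5); the AUC is computed with the standard rank-based pairwise formula, and all names are our own.

```python
import math

def proximity(u, v):
    # score of a candidate link in the embedded space: sigmoid of the
    # inner product of the two object vectors, following Eq. (5)
    return 1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(u, v))))

def auc(pos_scores, neg_scores):
    # area under the ROC curve, computed as the probability that a randomly
    # chosen linked pair outscores a randomly chosen unlinked pair (ties count 1/2)
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))
```

For edge type write, `pos_scores` would hold the proximities of the (author, paper) pairs that are actually linked in DBLP, and `neg_scores` those of the unlinked pairs.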
Finally,us- j ingtherealDBLPnetworkasground-truth,wecomputetheAUC valueforeachembeddingmethod. The results for d = 10 are shown in Table 3. Observe that, ingeneral, HINE_PCandHINE_PCRWhavebetterperformance comparedtoLINEandDeepWalk. Inordertoanalyzethereasons behindthebadperformanceofLINE,wealsoincludedtwospecial versions of LINE: LINE_1st (LINE_2nd) is a simple optimiza- (a) LINE (b) DeepWalk tionapproachthatjustusesstochasticgradientdescenttooptimize just the first-order (second-order) proximity among the vertices. LINE_1st has much better performance than LINE_2nd because second-orderproximitypreservationdoesnotfacilitatelinkpredic- tion. LINE,whichconcatenatesLINE_1standLINE_2ndvectors, hasworseperformancethanLINE_1stbecauseitsLINE_2ndvec- torcomponentharmslinkpredictionaccuracy. Thisisconsistent with our expectation, that training proximities of multiple orders simultaneouslyisbetterthantrainingthemseparately. Amongall methods,HINE_PCRWhasthebestperformanceinpreservingthe (c) HINE_PC (d) HINE_PCRW links of HINs; in the cases where HINE_PCRW loses by other methods, its AUC is very close to them. HINE_PCRW outper- Figure6:VisualizationResultsonDBLP formsHINE_PC,whichisconsistentwithresultsinpreviouswork that find PCRW superior to PC (e.g., [10]). The rationale is that a random walk models proximity as probability, which naturally Table 5 shows the results for d = 10. We can see that all weighsnearbyobjectshighercomparedtoremoteones. methodshavebetterresultsonDBLPandYELP,thanonMOVIE 5.3 Classification and GAME. This is because the average degree on MOVIE and GAME is smaller, and the HINs are sparser (See Table 2). Ob- servethatHINE_PCandHINE_PCRWoutperformDeepWalkand Table5:ResultsofClassificationwithd=10 LINE in DBLP and MOVIE; HINE_PCRW performs the best in LINE DeepWalk HINE_PC HINE_PCRW both datasets. 
On YELP, LINE has slightly better results than HINE_PCRW (about 1%), while DeepWalk has a very poor per- Macro-F1 0.816 0.839 0.844 0.868 DBLP Micro-F1 0.817 0.840 0.844 0.869 formance. OnGAME,DeepWalkisthewinner, whileLINEand Macro-F1 0.324 0.280 0.346 0.365 HINE_PCRWperformsimilarly.Overall,wecanseethatHINE_PCRW MOVIE Micro-F1 0.369 0.328 0.387 0.406 hasthebest(orclosetothebest)performance,andLINEhasquite Macro-F1 0.898 0.339 0.470 0.886 goodperformance.Ontheotherhand,DeepWalk’sperformanceis YELP Micro-F1 0.887 0.433 0.518 0.878 notstable. Macro-F1 0.451 0.487 0.415 0.438 Wealsomeasuredtheperformanceofthemethodsfordifferent GAME Micro-F1 0.518 0.545 0.501 0.518 valuesofd. Figures5(a)and5(b)showtheresultsonDBLP.As Macro-F1 0.622 0.486 0.519 0.639 d becomes larger, all approaches get better in the beginning, and Overall Micro-F1 0.648 0.537 0.563 0.668 thenconvergetoacertainperformancelevel.Observethatoverall, HINE_PCRWhasthebestperformanceinclassificationtasks. 5.4 Clustering Table6:ResultsofClusteringwithd=10 Wealsoconductedclusteringtaskstoassesstheperformanceof NMI LINE DeepWalk HINE_PC HINE_PCRW themethods. InDBLPweclustertheauthors,inMOVIEweclus- DBLP 0.3920 0.4896 0.4615 0.4870 terthemovies,inYELPweclustertherestaurants,andinGAME MOVIE 0.0233 0.0012 0.0460 0.0460 we cluster the games. We use the same ground-truth as in Sec- tion5.3.Weusenormalizedmutualinformation(NMI)toevaluate YELP 0.0395 0.0004 0.0015 0.0121 performance. GAME 0.0004 0.0002 0.0022 0.0004 Table 6 shows the results for d = 10. We can see that the Overall 0.1138 0.1229 0.1278 0.1364 performance of all methods in clustering tasks is not as stable as that in classification. All the embedding methods perform well Weconductataskofmulti-labelclassification. ForDBLP,we onDBLP,buttheyhavearelativelybadperformanceontheother usetheareasofauthorsaslabels. ForMOVIE,weusethegenres three datasets. On DBLP, DeepWalk and HINE have better per- ofmoviesaslabels. 
ForYELP,weusetherestauranttypesasla- formance than LINE. On MOVIE, HINE_PC and HINE_PCRW bels. ForGAME,weusethetypeofgamesaslabels. Wefirstuse outperformallotherapproaches. OnYELP,LINEhasbetterper- differentmethodstoembedtheHINs.Then,werandomlypartition formancethanHINE_PCRW,andDeepWalkhasverypoorperfor- thesamplesintoatrainingandatestingsetwithratio4 : 1. Fi- mance.OnGAME,DeepWalkoutperformstheothers.Overall,we nally,weuseknearestneighbor(k−NN)classifierswithk=5to canseethatHINE_PCRWoutperformsalltheothermethodsinthe predictthelabelsofthetestsamples. Werepeattheprocessfor10 taskofclustering. timesandcomputetheaverageMacro-F1andMicro-F1scoresto Figure 5(c) shows performance of the approaches for different evaluatetheperformance. dvaluesonDBLP.ObservethatHINE_PCRWingeneraloutper- Table3:AccuracyofNetworkRecovery(d=10) EdgeType LINE_1st LINE_2nd LINE DeepWalk HINE_PC HINE_PCRW write 0.9885 0.7891 0.8907 0.9936 0.9886 0.9839 publish 0.9365 0.6572 0.8057 0.9022 0.9330 0.9862 DBLP mention 0.9341 0.5455 0.7186 0.9202 0.8965 0.8879 cite 0.9860 0.9071 0.9373 0.9859 0.9598 0.9517 actIn 0.9648 0.5214 0.8425 0.7497 0.9403 0.9733 direct 0.9420 0.5353 0.8468 0.7879 0.9359 0.9671 MOVIE produce 0.9685 0.5334 0.8599 0.7961 0.9430 0.9900 compose 0.9719 0.5254 0.8574 0.9475 0.9528 0.9864 issue 0.9249 0.4419 0.8744 1.0000 0.5083 0.9960 towards 0.9716 0.4283 0.9221 0.5372 0.5155 0.9981 YELP mention 0.8720 0.4178 0.7738 0.5094 0.5951 0.8534 design 0.9599 0.4997 0.6781 0.9313 0.9219 0.9828 develop 0.9021 0.5043 0.7736 0.9328 0.8592 0.9843 GAME publish 0.7800 0.4719 0.6786 0.9711 0.7617 0.9756 Overall Average 0.9359 0.5556 0.8185 0.8546 0.8365 0.9655 formsallothermethods. Onlyforanarrowrangeofd(d = 10) 1.00 1.00 DeepWalk slightly outperforms HINE_PCRW (as also shown in 0.90 0.90 Table 6). Generally speaking, HINE_PCRW best preserves the proximitiesamongauthorsinthetaskofclustering. 
(a) Macro-F1 w.r.t. l  (b) Micro-F1 w.r.t. l  (c) NMI w.r.t. l  (d) Execution time w.r.t. l
Figure 7: Performances w.r.t. l

5.5 k-NN Search

We compare the performances of three methods, i.e., LINE, DeepWalk and HINE_PCRW, on k-NN search tasks. Specifically, we conduct a case study on DBLP to compare the quality of the k nearest authors for venue WWW in the embedded space. We first evaluate the quality of k-NN search by counting the average number of papers that the authors in the k-NN result have published in WWW, when varying k from 1 to 100 (Figure 4(a)). We can see that the nearest authors found in the embedding of HINE_PCRW have more papers published in WWW than the ones found by LINE and DeepWalk. We then use the nearest authors for venue WWW in the original DBLP network as ground-truth, and we use two different metrics to evaluate the quality of the top-k lists in the embedded space, i.e., Spearman's footrule F and Kendall's tau K [9]. The results are shown in Figures 4(b) and 4(c). We can see that the top-k list of HINE_PCRW is closer to the one in the original HIN.

5.6 Visualization

We compare the performances of all approaches on the task of visualization, which aims to lay out an HIN on a 2-dimensional space. Specifically, we first use an embedding method to map DBLP into a vector space; then, we map the vectors of authors to a 2-D space using the t-SNE [16] package.

The results are shown in Figure 6. LINE can basically separate the authors from different groups (represented by the same color), but some blue points are mixed with other groups. The result of DeepWalk is not very good, as many authors from different groups are mixed together. HINE_PC clearly has better performance than DeepWalk. Compared with LINE, HINE_PC can better separate the different groups. Finally, HINE_PCRW's result is the best among these methods, because it clearly separates the authors from different groups, leaving a lot of empty space between the clusters. This is consistent with the fact that HINE_PCRW has the best performance in classifying the authors of DBLP.

5.7 Case Study

We perform a case study, which shows the k-NN objects to a given object in DBLP. Specifically, we show in Table 4 the top-5 closest venues, authors and topics to author Jiawei Han in DBLP. By looking at the results for top-5 venues, we can see that LINE does not give a good top-5 list, as it cannot rank KDD in the top publishing venues of the author. DeepWalk is slightly better, but it ranks ICDM at the 5th position, while HINE_PCRW ranks KDD and ICDM as 1st and 2nd, respectively. Looking at the results for authors, observe that HINE_PCRW gives a similar top-5 list to DeepWalk, except that HINE_PCRW prefers top researchers, e.g., Christos Faloutsos. Looking at the results for topics, note that only HINE_PCRW can provide a general topic like "mining".

[13] X. Kong, P. S. Yu, Y. Ding, and D. J. Wild. Meta path-based collective classification in heterogeneous information networks. In CIKM, pages 1567-1571. ACM, 2012.
[14] N. Lao and W. W. Cohen. Relational retrieval using a combination of path-constrained random walks. Machine Learning, 81(1):53-67, 2010.

5.8 Effect of the l Threshold

We now study the effect of l on different data mining tasks, e.g., classification and clustering. Figures 7(a) and (b) show the results of classification on DBLP. We can see that 1) HINE_PCRW has quite stable performance w.r.t. l, and it achieves its best performance when l = 2; 2) the performance of HINE_PC is best when l = 1. This is because PathCount views meta path instances of different lengths as equally important, while PCRW assigns smaller weights to the longer instances when performing the random walk. This explains why HINE_PCRW outperforms HINE_PC in these tasks.

Figure 7(c) shows the results of clustering on DBLP. The results are similar to those of classification, except that HINE_PC has much better performance than HINE_PCRW when l = 1.
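The weighting difference between PathCount and PCRW can be made concrete with a small sketch of a path-constrained random walk [14]. The adjacency encoding, object names and edge-type names below are hypothetical toy data, not the paper's implementation:

```python
def pcrw(graph, start, meta_path):
    """Probability that a random walk from `start`, constrained to follow the
    edge types in `meta_path`, reaches each object (cf. Lao and Cohen [14])."""
    probs = {start: 1.0}
    for edge_type in meta_path:
        nxt = {}
        for node, p in probs.items():
            neighbors = graph.get((node, edge_type), [])
            if not neighbors:
                continue  # the walk dies at nodes without a typed out-edge
            share = p / len(neighbors)  # uniform step over typed neighbors
            for nb in neighbors:
                nxt[nb] = nxt.get(nb, 0.0) + share
        probs = nxt
    return probs

# Toy HIN fragment (hypothetical): author a1 writes papers p1 and p2,
# published in venues v1 and v2 respectively.
graph = {
    ("a1", "write"): ["p1", "p2"],
    ("p1", "publishedIn"): ["v1"],
    ("p2", "publishedIn"): ["v2"],
}
print(pcrw(graph, "a1", ["write", "publishedIn"]))  # → {'v1': 0.5, 'v2': 0.5}
```

Because every hop splits the remaining probability mass among the typed neighbors, each of the two A → P → V instances above contributes only 0.5, and longer instances contribute geometrically smaller weights, whereas PathCount adds 1 per instance regardless of length.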
This is natural, because when l = 1, HINE_PCRW can only capture very local proximities. When l > 1, HINE_PCRW outperforms HINE_PC.

5.9 Efficiency

We show in Figure 7(d) the running time for computing all-pair truncated proximities on DBLP with the method described in Section 4.1. We can see that our proposed algorithm runs in reasonable time and scales well with l. Specifically, in our experiments with l = 2, the time for computing all-pair proximities is less than 1000s. In addition, note that using ASGD to solve our objective function in our experiments is very efficient, taking on average 27.48s on DBLP with d = 10.

[15] M. Ley. The DBLP computer science bibliography: Evolution, research issues, perspectives. In SPIRE, pages 1-10. Springer, 2002.
[16] L. v. d. Maaten and G. Hinton. Visualizing data using t-SNE. Journal of Machine Learning Research, 9(Nov):2579-2605, 2008.
[17] C. Meng, R. Cheng, S. Maniu, P. Senellart, and W. Zhang. Discovering meta-paths in large heterogeneous information networks. In WWW, pages 754-764. ACM, 2015.
[18] T. Mikolov, K. Chen, G. Corrado, and J. Dean. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781, 2013.
[19] T. Mikolov and J. Dean. Distributed representations of words and phrases and their compositionality. Pages 3111-3119, 2013.
[20] D. Mottin, M. Lissandrini, Y. Velegrakis, and T. Palpanas. Exemplar queries: Give me an example of what you need. PVLDB, 7(5):365-376, 2014.
[21] S. Pan, J. Wu, X. Zhu, C. Zhang, and Y. Wang. Tri-party deep network representation. In IJCAI, pages 1895-1901, 2016.
[22] B. Perozzi, R. Al-Rfou, and S. Skiena. DeepWalk: Online learning of social representations. In SIGKDD, pages 701-710. ACM, 2014.

6. CONCLUSION

In this paper, we study the problem of heterogeneous network embedding for meta path based proximity, which can fully utilize the heterogeneity of the network.
We also define an objective function, which aims at minimizing the distance between two distributions, one modeling the meta path based proximities, the other modeling the proximities in the embedded low-dimensional space. We also investigate using negative sampling to accelerate the optimization process. As shown in our extensive experiments, our embedding methods can better recover the original network, and they perform better in several data mining tasks, e.g., classification, clustering and visualization, than state-of-the-art network embedding methods.

7. REFERENCES

[1] A. Ahmed, N. Shervashidze, S. Narayanamurthy, V. Josifovski, and A. J. Smola. Distributed large-scale natural graph factorization. In WWW, pages 37-48. ACM, 2013.
[2] S. Auer, C. Bizer, G. Kobilarov, J. Lehmann, R. Cyganiak, and Z. Ives. DBpedia: A nucleus for a web of open data. In The Semantic Web, pages 722-735. Springer, 2007.
[23] B. Recht, C. Re, S. Wright, and F. Niu. Hogwild: A lock-free approach to parallelizing stochastic gradient descent. In NIPS, pages 693-701, 2011.
[24] S. T. Roweis and L. K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500):2323-2326, 2000.
[25] C. Shi, X. Kong, Y. Huang, S. Y. Philip, and B. Wu. HeteSim: A general framework for relevance measure in heterogeneous networks. TKDE, 26(10):2479-2492, 2014.
[26] F. M. Suchanek, G. Kasneci, and G. Weikum. Yago: A core of semantic knowledge. In WWW, pages 697-706, 2007.
[27] Y. Sun, R. Barber, M. Gupta, C. C. Aggarwal, and J. Han. Co-author relationship prediction in heterogeneous bibliographic networks. In ASONAM, pages 121-128. IEEE, 2011.
[28] Y. Sun, J. Han, C. C. Aggarwal, and N. V. Chawla. When will it happen? Relationship prediction in heterogeneous information networks. In WSDM, pages 663-672. ACM, 2012.
[29] Y. Sun, J. Han, X. Yan, P. S. Yu, and T. Wu. PathSim: Meta path-based top-k similarity search in heterogeneous information networks. PVLDB, 4(11):992-1003, 2011.
[3] M. Belkin and P. Niyogi. Laplacian eigenmaps and spectral techniques for embedding and clustering. In NIPS, volume 14, pages 585-591, 2001.
[4] K. Bollacker, C. Evans, P. Paritosh, T. Sturge, and J. Taylor. Freebase: A collaboratively created graph database for structuring human knowledge. In SIGMOD, pages 1247-1250, 2008.
[5] A. Bordes, J. Weston, R. Collobert, and Y. Bengio. Learning structured embeddings of knowledge bases. In AAAI, 2011.
[6] S. Cao, W. Lu, and Q. Xu. GraRep: Learning graph representations with global structural information. In CIKM, pages 891-900. ACM, 2015.
[7] S. Chang, W. Han, J. Tang, G.-J. Qi, C. C. Aggarwal, and T. S. Huang. Heterogeneous network embedding via deep architectures. In SIGKDD, pages 119-128. ACM, 2015.
[8] T. F. Cox and M. A. Cox. Multidimensional scaling. CRC Press, 2000.
[9] R. Fagin, R. Kumar, and D. Sivakumar. Comparing top k lists. SIAM Journal on Discrete Mathematics, 17(1):134-160, 2003.
[10] Z. Huang, Y. Zheng, R. Cheng, Y. Sun, N. Mamoulis, and X. Li. Meta structure: Computing relevance in large heterogeneous information networks. In SIGKDD, pages 1595-1604, 2016.
[11] N. Jayaram, A. Khan, C. Li, X. Yan, and R. Elmasri. Querying knowledge graphs by example entity tuples. TKDE, 27(10):2797-2811, 2015.
[30] Y. Sun, B. Norick, J. Han, X. Yan, P. S. Yu, and X. Yu. PathSelClus: Integrating meta-path selection with user-guided object clustering in heterogeneous information networks. TKDD, 7(3):11, 2013.
[31] J. Tang, M. Qu, M. Wang, M. Zhang, J. Yan, and Q. Mei. LINE: Large-scale information network embedding. In WWW, pages 1067-1077. ACM, 2015.
[32] J. B. Tenenbaum, V. De Silva, and J. C. Langford. A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500):2319-2323, 2000.
[33] D. Wang, P. Cui, and W. Zhu. Structural deep network embedding. In SIGKDD, pages 1225-1234. ACM, 2016.
[34] X. Yu, X. Ren, Y. Sun, Q. Gu, B. Sturt, U. Khandelwal, B. Norick, and J. Han. Personalized entity recommendation: A heterogeneous information network approach. In WSDM, pages 283-292. ACM, 2014.
[35] J. Zhang, P. S. Yu, and Z.-H. Zhou. Meta-path based multi-network collective link prediction. In SIGKDD, pages 1286-1295. ACM, 2014.
[12] M. Ji, J. Han, and M. Danilevsky. Ranking-based classification of heterogeneous information networks. In SIGKDD, pages 1298-1306. ACM, 2011.

Table 4: Top-5 similar lists for Jiawei Han

k  LINE (conf / author / topic)        DeepWalk (conf / author / topic)     HINE_PCRW (conf / author / topic)
1  PAKDD / Jiawei Han / clustermap     KDD / Jiawei Han / itemsets          KDD / Jiawei Han / closed
2  PKDD / Gao Cong / orders            PAKDD / Xifeng Yan / farmer          ICDM / Xifeng Yan / mining
3  SDM / Lei Liu / summarizing         PKDD / Ke Wang / closed              SDM / Ke Wang / itemsets
4  KDD / Yiling Yang / ccmine          SDM / Yabo Xu / tsp                  PAKDD / Christos Faloutsos / sequential
5  ICDM / Daesu Lee / concise          ICDM / Jason Flannick / prefixspan   PKDD / Xiaoxin Yin / massive

(a) Avg. #paper w.r.t. k  (b) F w.r.t. k  (c) K w.r.t. k
Figure 4: Results of top-k lists for venue WWW (methods: LINE, DeepWalk, HINE_PCRW)

(a) Macro-F1 w.r.t. d  (b) Micro-F1 w.r.t. d  (c) NMI w.r.t. d
Figure 5: Performance of classification and clustering w.r.t. d (methods: LINE, DeepWalk, HINE_PC, HINE_PCRW)
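As an illustration of the two top-k distance measures used in Section 5.5, the following sketch applies Spearman's footrule and the Kendall tau distance [9] to the top-5 venue lists of Table 4. It is simplified to items shared by both lists (the paper uses the full top-k variants of Fagin et al.), which suffices here because both lists rank the same five venues:

```python
# Top-5 venue lists for Jiawei Han from Table 4 (LINE vs. HINE_PCRW).
line_top5 = ["PAKDD", "PKDD", "SDM", "KDD", "ICDM"]
hine_top5 = ["KDD", "ICDM", "SDM", "PAKDD", "PKDD"]

def footrule(a, b):
    # Spearman's footrule: total rank displacement of the shared items.
    pos_b = {x: i for i, x in enumerate(b)}
    return sum(abs(i - pos_b[x]) for i, x in enumerate(a) if x in pos_b)

def kendall(a, b):
    # Kendall tau distance: number of item pairs ranked in opposite order.
    pos_b = {x: i for i, x in enumerate(b)}
    common = [x for x in a if x in pos_b]
    return sum(
        1
        for i in range(len(common))
        for j in range(i + 1, len(common))
        if pos_b[common[i]] > pos_b[common[j]]
    )

print(footrule(line_top5, hine_top5))  # → 12
print(kendall(line_top5, hine_top5))   # → 8
```

Both distances are 0 for identical rankings; here LINE's venue list is close to a reversal of HINE_PCRW's (footrule 12 of a possible 12, Kendall 8 of 10), which matches the discussion in Section 5.7.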