ebook img

Tracking the evolution of social emotions with topic models PDF

28 Pages·2015·2.65 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Tracking the evolution of social emotions with topic models

KnowlInfSyst DOI10.1007/s10115-015-0865-0 REGULAR PAPER Tracking the evolution of social emotions with topic models ChenZhu1 · HengshuZhu2 · YongGe3 · EnhongChen1 · QiLiu1 · TongXu1 · HuiXiong4 Received:13December2014/Revised:7April2015/Accepted:9July2015 ©Springer-VerlagLondon2015 Abstract Many of today’s online news Web sites have enabled users to specify different types of emotions (e.g., angry or shocked) they have after reading news. Compared with traditionaluserfeedbackssuchascommentsandratings,thesespecificemotionannotations are more accurate for expressing users’ personal emotions. In this paper, we propose to exploittheseusers’emotionannotationsforonlinenewsinordertotracktheevolutionof emotions, which plays an important role in various online services. A critical challenge is how to model emotions with respect to time spans. To this end, we propose a time- aware topic modeling perspective for solving this problem. Specifically, we first develop two models named emotion-Topic over Time (eToT) and mixed emotion-Topic over Time (meToT),inwhichthetopicsofnewsarerepresentedasabetadistributionovertimeanda multinomialdistributionoveremotions.Whiletheycanuncoverthelatentrelationshipamong news, emotion and time directly, they cannot capture the evolution of topics. Therefore, we further develop another model named emotion-based Dynamic Topic Model (eDTM), whereweexplorethestatespacemodelfortrackingtheevolutionoftopics.Inaddition,we demonstratethatallofproposedmodelscouldenableseveralpotentialapplications,suchas emotionprediction,emotion-basednewsrecommendations,andemotionanomalydetections. Finally,wevalidatetheproposedmodelswithextensiveexperimentswithareal-worlddata set. B EnhongChen [email protected] ChenZhu [email protected] 1 SchoolofComputerScienceandTechnology,UniversityofScienceandTechnologyofChina, Hefei230027,China 2 BigDataLab,BaiduResearch,Beijing100085,China 3 DepartmentofComputerScience,TheUniversityofNorthCarolinaatCharlotte,UniversityCity Blvd,Charlotte,NC28223-0001,USA 4 ManagementScienceandInformationSystemsDepartment,RutgersBusinessSchool, RutgersUniversity,Newark,NJ07102,USA 123 C.Zhuetal. Keywords Socialemotions·Topicmodels·Sentimentanalysis 1 Introduction With the increasing prosperity of Web 2.0, people are encouraged to have various social interactions onthe Websites. A recentdevelopmenttrendof onlinenewsWebsites, such as Yahoo! and Sina, is to allow readers to specify different types of emotions (e.g., angry andshocked)afterreadingnews.Comparedwithtraditionalusers’feedback(e.g.,reviews, tags), such specific emotion annotations are more accurate for expressing users’ personal emotions. For example, Fig. 1a shows an example of users’ aggregated emotions at Sina News,1whichhassixdifferentkindsofemotions.Eachusercanchooseoneemotion,which most accurately reflects his impression after reading, to annotate a piece of news. If we collectivelylookatandanalyzealluseremotionsfromthenewsWebsite,wemaybeable togetagoodpictureabouttheoverallemotionsofonlinesocialmediausers,namelysocial emotions.Thesocialemotionsoftenvarywithrespecttothetopicsofnewsandtimeandthus have intrinsic dynamics. For example, Fig. 1b shows the distribution of aggregated social emotionswithrespecttodifferenttimespansinourdatasetcollectedfromSinaNews.We canobservedifferentdistributionsofemotionsalongthetime,whichindicatesthatthesocial emotionsevolveovertime.Infact,suchevolutionofsocialemotionsisinherentlydrivenby thedynamictopicsofnewsatdifferenttime.Capturingsuchdynamiccharacteristicsofsocial emotions is critically important for the successful development of various social services, suchassocialopinionmonitoringandsocialeventdetections.Intheliterature,therearerecent studiesaboutsocialemotion-relatedproblems.Forexample,someworksfocusonsentiment analysis[24,35],socialemotionanalysis[6,13,18]ofonlinedocumentsanduseremotion modeling [3]. However, few of them have paid attention to the dynamic characteristics of socialemotions. To this end, in this paper, we propose to exploit the users’ emotion annotations from onlinenewstotracktheevolutionofsocialemotions.Acriticalchallengeishowtomodel emotionswithrespecttotimespans.Alongthisline,weproposeatime-awaretopicmodeling perspectiveforsolvingthisproblem.Specifically,wefirstdevelopamodelnamedemotion- TopicoverTime(eToT),wherewerepresenteachtopicofnewswithabetadistributionover timeandamultinomialdistributionoveremotions.Inthismodel,theprocessofmodeling topicsofnewsisaffectednotonlybythewordco-occurrencesbutalsotheemotionsandtime. Particularly,withsomeexperimentalobservations,wefindthatnotallthetopicshavedistinct relationshipswithtime,suchasthecommontopicsorthetopicsaboutnewsterminology. Therefore,wefurtherproposeanextensionofeToTmodel,namedmixedemotion-Topicover Time(meToT),inwhichawordcanbegeneratedfromtwodifferenttopicsets,i.e.,oneistime- independentandtheotheristime-dependent.Indeed,similartothepopularLatentDirichlet Allocation(LDA)[5],newstopicsinbotheToTandmeToTarestaticwhichdonotchange withrespecttodifferenttimespans.However,whilesomeresearchershaverevealedthatthe latenttopicsofdocumentsmayevolveastimeunfolds[4],eToTandmeToTcannotcapturethe dynamicsoftopics.Therefore,weproposeanothermodel,namelyemotion-basedDynamic TopicModel(eDTM),tocapturethedynamicsofnewstopics.IneDTM,wefirstdivideall newsintodifferentsegmentationswithrespecttotheirtimestampsandthenimplementtopic modeling for each segmentation. Models learned from different segmentations are linked togetherbytheMarkovstatespacemodel.Infact,alleToT,meToT,andeDTMcanmodel 1 http://news.sina.com.cn/. 123 Trackingtheevolutionofsocialemotionswithtopicmodels Fig.1 aExamplesofdifferentuseremotions.bThedistributionofsocialemotionswithrespecttonewsin differenttimespans theevolutionofsocialemotioneffectivelyandhelpusunderstandthesemanticrelationships betweensocialemotionandnewstopicsindifferentviews.Inaddition,wedemonstratethatall ofthemaregeneralmodelsandcouldenableseveralpotentialapplicationsofsocialmedia, such as emotion prediction, emotion-based news recommendation and emotion anomaly detection.Specially,thecontributionsofthispapercanbesummarizedasfollows. – First, we provide a comprehensive study to track the evolution of social emotions by exploitingtheuseremotionannotationsfromonlinenews.Indeed,thisstudyisvitalfor thesuccessfuldevelopmentofvarioussocialservices. – Second,weproposethreenoveltopicmodels,i.e.,eToT,meToT,andeDTM,forsolv- ingtheproblemoftrackingtheevolutionofsocialemotion.Particularly,theproposed modelscaneffectivelymodeltheemotions,newsandtimefromdifferentviews.Also, weintroducesomenovelemotion-basedapplicationsenabledbytheproposedmodels. – Third,wecarryoutextensiveexperimentsonareal-worlddatasetwhichwascollected from Sina News to evaluate the proposed models. As shown in our experiments, the proposedmodelsarealleffectiveformodelingsocialemotionsandotherenabledappli- cations. Theremainderofthispaperisorganizedasfollows.InSect.2,webrieflyintroducesome related works of this paper. Section 3 introduces the details of proposed topic models. In Sect.4,wemakesomediscussionsaboutourmodelsandintroducetheirpotentialapplica- tions.Section5reportstheexperimentalresults.Finally,weconcludethispaperinSect.6. 2 Relatedwork Therelatedworkscanbegroupedintotwocategories:sentimentanalysisandtopicmodel. Wewillgivebriefdiscussionsontheserelatedworksandtherelationshipsbetweenthemand theproposedmodelsinthispaper. 2.1 Sentimentanalysis Inthefieldofsentimentanalysis,existingresearchesmainlyfocusontwoaspects,namely sentimentclassificationandsentimentinformationextraction.Amongthem,sincesentiment extractionaimstoextractunitsthatarerelatedtoemotionatsentencelevelorparagraphlevel, itislessrelevanttoourscenario.Therefore,inthefollowing,weonlydiscusstheworkson sentimentclassification[13,16,18,26,31,37]. Classificationisaveryfundamentalprobleminmachinelearning[26,39,40].Tradition- ally,sentimentclassificationisoftenaddressedbyclassicclassificationapproachesdirectly. 123 C.Zhuetal. Recently,researchershaveproposedalotofnewapproachesforsentimentclassificationand beguntostudythereaders’emotions.Forinstance,in[17],tworankingmethodswerepre- sentedtorankreaders’emotions.Linetal.[18]alsostudiedthereaders’emotionsthatare triggeredbyarticles,andtheauthorsmainlyfocusonfeatureselection.Kozarevaetal.[13] designedanapproachforclassifyingheadlineemotionbasedontheinformationcollected fromtheWorldWideWeb.Besides,emoticon,asakindofemotionlabel,hasalsobeenused forsentimentclassification.Zhaoetal.[38]builtasystemcalledMoodLens,inwhich95 emoticonsaremappedintofourcategoriesofsentimentsforthesentimentclassificationof ChinesetweetsinWeibo.Liuetal.[20]treatedtheemoticonsaslabeleddataandintegrated itandlabeleddataintoauniformsentimentclassificationframework. We can find that most of existing works aim to study the sentiment directly from the documents (words). Since there is a consensus that a document is composed of topics, it isalsoreasonabletotreatsentimentofadocumentasacombinationofsentimentonthese topics. Therefore, in this paper, we focus on analyzing sentiment from the perspective of thesentiment(emotion)ofthetopics.Therehavebeensomeworksabouttheemotionwith respecttotopics.Forinstance,Meietal.[22]modeledtextsthroughamixtureoftopicmodel andsentimentmodel.LinandHe[16]alsoextendedLDAtodetectsentiment andtopics. However,noneofthemconsidersthetemporaleffect. 2.2 Topicmodels Topicmodelisatoolforanalyzingtextsorothertypesofdiscretedata.Manydifferentkinds of topic models [4,5,21,33] have been proposed, among which Latent Dirichlet Alloca- tion(LDA)[5]isverypopular.Actually,themodelsproposedinthispapercanbealsotraced backtoLDA.ThebasicassumptionofLDAisthateachdocumentcanbetreatedasamixture oftopicsandthegenerationofwordsinthedocumentisattributabletooneofthosetopics. However,LDAisastaticmodelthatdoesnotconsidertheinfluenceoftime,soitisinap- propriatetobeadoptedinmanyapplications,includingsocialemotionanalysis.Tothisend, severaltopicmodelshaveexaminedtopicsandtheirchangeswithtime.Forinstance,Topics overTime(ToT)[33]modelassociateseachtopicwithabetadistributionovertimetodirectly capturethetopicchangesovertime.TherelativesimplicityofToTmakesiteasytoinjectits ideastoothertopicmodels.DifferentfromToT,DynamicTopicModel(DTM)[4]usesstate space model on the natural parameters to model the evolutionof topic and the variational approximationsdependingonKalmanfiltersandwaveletregression.DTMrequiresthattime mustbediscretized,andthushowtodeterminethelengthoftimespanisanimportantprob- lem. Wang et al. [32] solved this problem by replacing state space model with Brownian motionwhichisactuallyanacontinuousgeneralizationofstatespacemodel.Anotherwayto solvethisproblemwasproposedin[11],whichconsidersmulti-scaletimespantotrackthe topicchanges.TopicTrackingModel(TTM)[9]isanotherDynamicTopicModelfortrack- ingconsumerpurchasebehaviors.ComparedwithDTM,itsimplifiesthemodelbyreplacing GaussiandistributionwithDirichletdistributionsothattheparameterscouldbeestimatedby Gibbssamplingwhichissimplercomparedwithvariationalmethods.Insummary,thereare twomainapproachestomodeltheevolutionoftopicsovertime:treatingtimeasanattribute oftopics,e.g.,ToT;usingstatespacemodelorBrownianmotiononnaturalparameters,e.g., DTMandTTM. Meanwhile,someworkextendedLDAfromanotherperspective.Theassumptionbehind them, which is similar to that of meToT, is each word could come from several different topicsets.Beyondbagofwords,Wangetal.[34]introducedphrasesandproposedtopical n-grams,inwhichawordmaycomefromatopic-specificunigramdistributionorabigram 123 Trackingtheevolutionofsocialemotionswithtopicmodels distribution.Followingsimilaridea,Pualetal.[27]triedtodetectculturaldifferencesfrom online blogs and forums. They assumed that an article from a special cultural group is a mixtureofsomegeneraltopicsandculture-specialtopics.Besides,someresearchershave modeledsentimentbythisway.Forinstance,TitovandMcDonald[31]proposedaMulti- AspectSentimentmodelconsistingofanextensionofLDA,Multi-GrainLDA,andsentiment predictorsforsentimentsummarization. Asintroducedabove,therehavebeensomeworksaimingtoanalyzesentimentbytopic model,butnoneofthemfocusonanalyzingsocialemotionswithrespecttotemporalinfor- mation. 3 Modelingsocialemotion In this section, firstly, some preliminaries and backgrounds will be introduced. Then, we explainthedetailsofourtime-awaretopicmodelingsolutionsforanalyzingsocialemotion. 3.1 Preliminariesandbackgrounds Recently,manyoftodaysonlinenewsWebsiteshaveenableduserstospecifytheiremotions afterreadingnews.Therefore,foreachpieceofnews,wecangetnotonlythetextualcon- tentandtimestampbutalsotheemotionthatwasannotatedbyusers.Forexample,Fig.1a illustratestheemotionannotationsfrom75users,andtheseemotionsareclassifiedintosix categories,i.e.,“Funny,”“Touched,”“Angry,”“Sad,”“Novel,”and“Shocked.”Asshownin Fig.1a,allemotionannotationsofapieceofnewsfromdifferentusersareusuallyaggregated together. Forsocialemotionmodeling,intuitively,users’emotionaboutadocument(e.g.,news)is mainlydecidedbytheiremotionaboutthetopicsofthedocument,whichcouldbeextracted bytopicmodel.Indeed,topicmodelisanunsupervisedmethodtoeffectivelyextracttopics fromthetextsbasedontheco-occurrenceofwords.Consideringthatdifferenttopicslead to different user emotions, e.g., some topics tend to depress people but some other topics maybeveryaffecting,itisreasonabletoassociateeachtopicwithadistributionwithrespect toemotion.Inthisway,wecanmodeltheevolutionofsocialemotionsbyintroducingan emotionandtimegenerationlayerintothetopicmodels,inwhichthediscoveryoftopicsis influencedbysocialemotions,co-occurringofwords,andalsotemporalinformation. Formally,weassumethatadocumentd isabagofwordsw ,sizeofwhichis N ,and d d w is from a vocabulary containing V words. Each document also has a timestamp t and d an observed multinomial distribution over emotions e. Thus, we can use a set of triples D = {(w ,t ,e ),...,(w ,t ,e )} to represent a collection of D documents. Next, we 0 0 0 D D D willexplainoursolutionstomodeltheevolutionofsocialemotionfromthecorpusD. 3.2 Emotion-topicovertime In this subsection, we would propose a novel topic model named emotion-Topic over Time(eToT)todirectlyuncoverthelatentrelationshipamongdocuments,emotion,andtime. ComparedwithLDA,eachtopicdiscoveredbyeToTalsohasadistributionwithrespectto timeandadistributionwithrespecttoemotion.Thatis,besidesunderstandingthecontent ofatopic,wecanknowthepeople’simpressionofthistopic(i.e.,socialemotion)andwhen thistopicemerges. 123 C.Zhuetal. Wefirstintroducethewaytoconnectsocialemotionswithtopics.Ithasbeenintroduced brieflyabovethatweassumetheusers’emotionsaboutaspecificdocumentcomefromtheir emotionsaboutthetopicsofthisdocumentandeachtopicshouldownaemotionaltilt.Thus, itisreasonabletoassumethateachtopichasalatentdistributionwithrespecttoemotionin eToT.Notethat,theemotionsarediscreteandthenumberofeachkindofemotionscouldbe observedforanydocuments(news).Therefore,givenanewsarticle,wecouldeasilygetthe distributionoveremotionsewithrespecttothisnews.Here,eisobserved,whiletheemotion distributionsontopicsη,whichdeterminee,arelatent.Specifically,weassumethislatent distributionfollowstheDirichletdistribution,i.e.,eachtopick hasaDirichletdistribution η with respect to emotion. Dirichlet distribution is a family of continuous multivariate k probabilitydistributionsoversimplexofwhichthesummationisequaltoone.Thus,itisthe appropriatechoicetomodelthedistributionofemotions.Givenatopick,theprobabilityof anobservedemotiondistributionecouldbecalculatedasfollows: (cid:2)(cid:3) (cid:4) P(e|ηk)= (cid:3)(cid:5)lE=1lE=(cid:3)1(ηηkk,,ll) ·l(cid:5)=E1elηk,l−1, (1) whereη areparametersofDirichletdistributionandtheinferenceforthemwillbeshown k later,andeisthevectorofproportionofemotions.Eisthenumberofcategoriesofemotions andisalsothesizeofvectorη .(cid:3)(·)isgammafunction. k Similarly,wealsodirectlyassociatetimewithtopics,i.e.,eachtopick hasalatentdis- tributionψ withrespecttotime.FollowingtheideasinToT,weselectbetadistributionas k ψ , which could behave versatile shapes. Therefore, we need to normalize the timestamp k intoarangefrom0to1firstly.Apointweshouldpayattentiontoishowtonormalizethe timestamps.Indeed,thetimestampscanbenormalizedindifferentways,whichdependson differentapplicationrequirements.Tobespecific,iftheyarenormalizedwithrespecttothe entiretimerange,themodelcancapturethelong-termrelationshipsbetweentimes,emotion, and topics, which can be used for social opinion monitoring. In contrast, if we normalize them with respect to a specific time range (e.g., week or month), the model can capture the periodical effects of social emotions (e.g., the emotions during Christmas each year), whichcanbeusedforpredictingsocialemotionsinthefuture.Particularly,forsimplicity, wejustchoosetheentiretimerangefornormalizingtimestamps.Then,givenatopick,the probabilityofanobservedtimestampt couldbecalculatedasfollows: P(t|ψk)= (cid:3)(cid:3)((ψψkk,1,1)+(cid:3)(ψψkk,2,2)) ·(1−t)ψk,1−1tψk,2−1, (2) whereψ areparametersofbetadistributionandtheinferenceforthemwillbeshownlater. k Note that, for simplifying the process of parameter estimation, we generate emotions and timestamp with respect to each word token. That is, we assume all word tokens in one document share the same emotion distribution e and the same timestamp t with the document.Insummary,thecorrespondinggraphicalmodelofeToTisshowninFig.2,and itsparameterizationsare θ |α ∼Dirichlet(α), d φ |β ∼Dirichlet(β), k zd,i|θd ∼Multinomial(θd), wd,i|φzd,i ∼Multinomial(φzd,i), td,i|ψzd,i ∼Beta(ψzd,i), ed,i|ηzd,i ∼Dirichlet(ηzd,i). 123 Trackingtheevolutionofsocialemotionswithtopicmodels Fig.2 ThegraphicalrepresentationofeToT Thus,thegenerativeprocessforadocumentd isasfollows: 1. Drawamultinomialdistributionθ overtopicsfromaDirichletpriorα; d 2. Drawparametersoftopic-wordmultinomialdistributionsφ fromaDirichletpriorβ; k 3. Forawordtokeni indocumentd: (a) Drawatopiczd,i fromMult(θd); (b) Drawawordwd,i fromMult(φzd,i); (c) Drawatimestamptd,i fromBeta(ψzd,i); (d) Drawamultinomialdistributioned,i overemotionsfromDirichlet(ηzd,i); WeemployGibbssamplingtoestimateparameters,i.e.,to“invert”thegenerativeprocess and“generate”latentvariablesfromgivenobservations.Forsimplicity,weestimateψ and ηonceperiterationofGibbssampling.Thejointdistributionoftopicz,timet,wordw,and emotioneisasfollows: (cid:6)D (cid:6)Nd P(w,t,z,e|α,β,η,ψ)= (P(td,i|ψzd,i)P(ed,i|ηzd,i))· d=1i=1 (3) (cid:6)K (cid:9)(nz+β) (cid:6)D (cid:9)(nd +α) . (cid:9)(β) (cid:9)(α) z=1 d=1 In Gibbs sampling procedure, what we need to calculate is the conditional distribution P(zd,i =k|w,t,z¬d,i,e,α,β,(cid:10),η)wherez¬d,i meansthetopicsforallofthewordtokens except wd,i. Because of conjugation, the conditional distribution in each iteration can be writtenas: P(zd,i =k|w,t,z¬d,i,e,α,β,(cid:10),η)∝ (cid:3) ndk(cid:2),¬d,i+αk (cid:4) · (cid:3)nkwd,i(cid:2),¬d,i+βwd,i(cid:4)· zK⎛=1 ndz(cid:2),¬(cid:3)d,i+αz (cid:4)vV=1 nkv,¬d,i+β⎞v (4) (1−td,i)ψk,1−1(td,i)ψk,2−1 ·⎝(cid:3)(cid:5) lE=1ηk,l ·(cid:6)E eηk,l−1⎠. B(ψk,1,ψk,2) lE=1(cid:3)(ηk,l) l=1 d,i,l Thenotationna referstothenumberoftimesbhasbeenassignedtoa.Forexample,nd is b k thenumberofwordtokensindocumentd thatareassignedtotopick,nkv isthenumberof wordvassignedtotopick,anded,i,l isthelthemotionproportionoftheithwordtokenin documentd.AftereachiterationofGibbssampling,weupdateψ andηas (cid:11) (cid:12) ψkˆ,1 =t¯k t¯k(S1(−t)2t¯k) −1 , (5) k 123 C.Zhuetal. (cid:11) (cid:12) ψkˆ,2 =(1−t¯k) t¯k(S1(−t)2t¯k) −1 , (6) k (cid:11) (cid:12) ηkˆ,l =ek¯,l ek¯,1S(1(e−)2k,e1k¯,1) −1 , (7) wheret¯ and S(t)2 arethesamplemeanandthebiasedsamplevarianceofthetimestamps k k ofwordtokensbelongingtotopick,respectively.Similarly,ek¯,l and S(e)2k,l arethesample meanandbiasedsamplevarianceoftheithemotionofwordtokensbelongingtotopick. AftertheprocessofGibbssampling,θd,k andφk,i canbeevaluatedasfollows: nd +α θdˆ,k = (cid:3)K k(nd +k α ), (8) z=1 z z nk +β φkˆ,i = (cid:3)vV=1i(nkv+i βv). (9) By eToT, we could extract topics and some additional information about them from a collectionoftexts.Thoughthetopicsareconstant,wegettheirdistributionswithrespectto timeψ anddiscretedistributionswithrespecttoemotionsη.Bythesedistributions,wecan understandtherelationshipsamongemotions,topics,andtime.Andthetemporalinformation couldhelpustodiscoverthoseemergencieseffectively[33],whichwill alsobeshownin experimentswell.Atlast,notethat,thegenerationprocessofoureToTmodelissimilartothe ToTmodel[33].Indeed,ineToT,topicdiscoveryisaffectedbyalloftheeffectsfromword co-occurrences,emotion,andtemporalinformation,whileToTonlyconsiderstheinfluence fromwordsandtime.SoToTcouldnotbeusedforsentimentanalysisandmodelingsocial emotions. 3.3 Mixedemotion-topicovertime IneToT,weassociatealltopicswithtime,whichmeansalltopicshavedistinctrelationships withtime.However,basedonourpreliminaryexperiments,weobservethatthetimedistrib- utionsofsometopicsdiscoveredbyeToTareverysmooth,suchasthecommontopicsorthe topicsaboutnewsterminology.Therefore,wearguethatsometopicsarenotsensitivetotime. Followingthisidea,weproposeanextensionofeToTmodel,namedmixedemotion-Topic overTime,inwhichtherearetwosetsoftopics,i.e.,oneistime-independentandtheother istime-dependent. InmeToT,weemploythesamemethodsofeToTtoassociatetopicswithtimeandemotion, whileeachdocumenthastwotopicdistributions,θ andθ ,correspondingtotwosetsoftopics 1 2 φ andφ .Althoughbothofthemhaveworddistributionsandemotiondistributions,onlyφ 1 2 1 ownsbetadistributionswithrespecttotime.Thus,onlyφ istime-dependent.Thegraphical 1 representation of meToT is shown in Fig. 3. Similar to eToT, we also generate emotions andtimestampswithrespecttoeachwordtoken.However,weemployanadditionalmixture componenttodistinguishwhereeachwordisgeneratedfrom,time-dependenttopicsortime- independenttopics,byanewsetofdiscretevariablesx.InmeToT,eachwordtokenhastwo topiclabels,z andz .Ifthestateofi-thwordtoken,x ,is1,thiswordandemotioncomefrom 1 2 i z andviceversa.Becausethetimestampisonlyrelatedtoθ ,ifx =1,thetimestampcould 1i 1 i bedrawnfromcorrespondingηbutifx =0,wedrawthetimestampfromabackgroundtime i distribution, Beta(ψ ),whichisauniformdistributionactually.Meanwhile x issampled 0 i 123 Trackingtheevolutionofsocialemotionswithtopicmodels Fig.3 ThegraphicalrepresentationofmeToT fromthebinomialdistributionρz1i,z2i.Andγ isthepriorofρ.ThemeToTparameterizations are θ |α ∼Dirichlet(α), 1d θ |κ ∼Dirichlet(κ), 2d φ |β ∼Dirichlet(β), k ρ|γ ∼Beta(γ), l z |θ ∼Multinomial(θ ), 1d,i 1d 1d z |θ ∼Multinomial(θ ), 2d,i 2d 2d xd,i|ρz1d,i,z2d,i ∼Bernoulli(ρz1d,i,z2d,i), wd,i|φz∗d,i ∼Multinomial(φz∗d,i), td,i|ψz∗d,i ∼Beta(ψz∗d,i), ed,i|ηz∗d,i ∼Dirichlet(ηz∗d,i), where ∗ in z∗d,i is a label, which could be set as 0 or 1, indicating whether it is a time- dependent topic or time-independent one. The generative process for a document d is as follows: 1. Drawamultinomialdistributionθ overtopicsfromaDirichletpriorα; 1d 2. Drawamultinomialdistributionθ overtopicsfromaDirichletpriorκ; 2d 3. Forawordtokeni indocumentd: (a) Drawatopicz fromMult(θ ); 1d,i 1d (b) Drawatopicz fromMult(θ ); 2d,i 2d ((dc)) DIfrxadw,ix=d,i1f:romBernoulli(ρz1d,i,z2d,i); i Drawawordwd,i fromMult(φz1d,i); ii Drawatimestamptd,i fromBeta(ψz1d,i); iii Drawamultinomialdistributioned,i overemotionsfromDirichlet(ηz1d,i); 123 C.Zhuetal. (e) ifxd,i =0: i Drawawordwd,i fromMult(φz2d,i); ii Drawatimestamptd,i fromBeta(ψ0); iii Drawamultinomialdistributioned,i overemotionsfromDirichlet(ηz2d,i); We also infer the topic assignments by Gibbs sampling. The corresponding parameter inferenceissimilartoeToT.Theconditionaldistributionisasfollows: P(z1d,i =k1,z2d,i =k2,xd,i =s|w,z1¬d,i,z2¬d,i,x¬d,i,t,e,α,β,ψ,η) ∝ (cid:2)(cid:3) ndk1,¬d,i +αk1(cid:4) (cid:2)(cid:3) ndk2,¬d,i +γk2(cid:4) nks1,k2 +γ0 ⎧ zK=11ndz,¬d,i +αz −1 zK=21ndz,¬d,i +γz −1nkx1=,k02 +nkx1=,k12 +γ0+γ1 ·⎪⎪⎪⎪⎨(cid:3)nvVkw=2d1,i(cid:2),¬ndkv2,,¬i+d,βi(cid:4)w+d,βivP(ed,i|ηk2)P(td,i|ψ0), if xd,i =0, (10) ⎪⎪⎪⎪⎩(cid:3)nvVkw=1d1,i(cid:2),¬ndkv1,,¬i+d,βi(cid:4)w+d,βivP(ed,i|ηk1)P(td,i|ψk1), if xd,i =1. The approach to evaluate φ,η,ψ is similar to that of eToT. With emToT, we can better analyzetherelationshipsamongemotion,time,andtopics.Inexperiments,wewouldshow thatmeToTcoulddistinguishstabletopicsanddynamiconesintermsoftime. 3.4 emotion-basedDynamicTopicModel AlthougheToTandmeToTcanuncoverthelatentrelationshipamongdocuments,emotion, andtime,theconstanttopicslearnedbyeToTcouldnotcapturethedynamicsofnewstopics. Tothisend,inthissubsection,weproposeanothermodel,emotion-basedDynamicTopic Model(eDTM),tosolvethisproblem. Differentfromthem,eDTMdiscoverstopicsbythecontentandemotioninformationin eachtimespan,andthetopicsindifferenttimespansarechainedbystatespacemodel.In this way, eDTM could learn the distribution over words and the distribution with respect to emotion for each topic in different time spans. Thus, the evolution of news topics is demonstratedclearlyanddirectlybyeDTM.Here,weassumethedistributionoverwordsof atopicshouldevolvesmoothly,whileemotionmaynot.Actually,thestudyonsocialscience hasshownthatthepopularmindisweird[1,14].Sometimes,itisbigoted,conservative,and imperious. At other times, it shows some totally different characteristics such as impulse, fickleness,andirritability. Thus,wejustchaintopicsdependingontheirdistributionsoverwords.Duetothestate space model, we need firstly divide texts into different time spans by timestamps. Then eDTMmodelsthedocumentsintimespant withatopicmodelwhichhas K topics,where these topics evolve from the topics associated with the time span t − 1. For simplicity, we use Dirichlet distribution to chain the neighboring topic models on the parameters of the multinomial distributions of topics over word tokens φ. We could also employ Gibbs samplingtoevaluatetheseparameters.Specifically,thesequentialstructureis P(φt,k|φt−1,k,λt,k)∝ (cid:5)V φtλ,tk,k,vφt−1,k,v−1, (11) v=1 whereλisaparametertocontroltheinfluenceofthetopicmodelintimespant −1onthe topicmodelintimespant. 123

Description:
two models named emotion-Topic over Time (eToT) and mixed emotion-Topic over Time. (meToT), in which the topics .. wd is from a vocabulary containing V words. The CEO of Alibaba are praised by people for his benefaction.
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.