Studies in Computational Intelligence 602

Mohamed Medhat Gaber, Mihaela Cocea, Nirmalie Wiratunga, Ayse Goker (Editors)

Advances in Social Media Analysis

Studies in Computational Intelligence, Volume 602

Series editor: Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland
e-mail: [email protected]

About this Series

The series “Studies in Computational Intelligence” (SCI) publishes new developments and advances in the various areas of computational intelligence, quickly and with a high quality. The intent is to cover the theory, applications, and design methods of computational intelligence, as embedded in the fields of engineering, computer science, physics and life sciences, as well as the methodologies behind them. The series contains monographs, lecture notes and edited volumes in computational intelligence spanning the areas of neural networks, connectionist systems, genetic algorithms, evolutionary computation, artificial intelligence, cellular automata, self-organizing systems, soft computing, fuzzy systems, and hybrid intelligent systems. Of particular value to both the contributors and the readership are the short publication timeframe and the worldwide distribution, which enable both wide and rapid dissemination of research output.
More information about this series at http://www.springer.com/series/7092

Editors

Mohamed Medhat Gaber
School of Computing Science and Digital Media, Robert Gordon University, Aberdeen, UK

Nirmalie Wiratunga
School of Computing Science and Digital Media, Robert Gordon University, Aberdeen, UK

Mihaela Cocea
School of Computing, University of Portsmouth, Portsmouth, UK

Ayse Goker
School of Computing Science and Digital Media, Robert Gordon University, Aberdeen, UK

ISSN 1860-949X
ISSN 1860-9503 (electronic)
Studies in Computational Intelligence
ISBN 978-3-319-18457-9
ISBN 978-3-319-18458-6 (eBook)
DOI 10.1007/978-3-319-18458-6
Library of Congress Control Number: 2015939149

Springer Cham Heidelberg New York Dordrecht London
© Springer International Publishing Switzerland 2015

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.
Printed on acid-free paper

Springer International Publishing AG Switzerland is part of Springer Science+Business Media (www.springer.com)

Preface

We are happy to present these carefully selected research projects in the area of social media analysis, organised into seven chapters. The chapters are diverse enough to give the reader insights into current research directions. However, owing to the importance of sentiment analysis in social media, six of the chapters present different techniques in this continuously growing area. The remaining chapter addresses an important research direction: how to detect newsworthy topics from social media websites.

Erik Tromp and Mykola Pechenizkiy, in the chapter “Pattern-Based Emotion Classification on Social Media”, adopt Plutchik's wheel of emotions model and their long-standing rule-based emotion detection method to classify a variety of emotions on social media. Carlos Martin, David Corney and Ayse Goker, in the chapter “Mining Newsworthy Topics from Social Media”, provide the reader with a number of information retrieval and data mining techniques that are able to identify newsworthy content in social media websites. Gizem Gezici, Berrin Yanikoglu, Dilek Tapucu and Yücel Saygın, in the chapter “Sentiment Analysis Using Domain-Adaptation and Sentence-Based Analysis”, motivate sentence-based sentiment analysis as opposed to the lexicon-based approach adopted in a large number of sentiment analysis techniques. In the chapter “Entity-Based Opinion Mining from Text and Multimedia”, Diana Maynard and Jonathon Hare show empirically how multimedia can help resolve the ambiguity of opinion. Such a multimodal approach is attracting growing interest, with all major social media websites providing means of using multimedia in users' posts. Aminu Muhammad, Nirmalie Wiratunga and Robert Lothian, in the chapter “Context-Aware Sentiment Analysis of Social Media”, argue that local and global contexts can enhance the performance of sentiment analysis, which they demonstrate experimentally.
In the chapter “Case-Studies in Mining User-Generated Reviews for Recommendation”, Ruihai Dong, Michael P. O’Mahony, Kevin McCarthy and Barry Smyth combine topic detection and sentiment analysis for filtering useful reviews and for product recommendation. Zheng Yuan and Matthew Purver, in the chapter “Predicting Emotion Labels for Chinese Microblog Texts”, provide experimental work on predicting emotion on a Chinese microblogging website, namely Sina Weibo, using n-gram features, of which the higher orders proved useful in enhancing emotion prediction.

This volume can serve an audience from both academia and industry looking for new advances in the area of social media analysis. We hope that the presented chapters open up opportunities for future research.

February 2015
Mohamed Medhat Gaber

Contents

Pattern-Based Emotion Classification on Social Media
Erik Tromp and Mykola Pechenizkiy

Mining Newsworthy Topics from Social Media
Carlos Martin, David Corney and Ayse Goker

Sentiment Analysis Using Domain-Adaptation and Sentence-Based Analysis
Gizem Gezici, Berrin Yanikoglu, Dilek Tapucu and Yücel Saygın

Entity-Based Opinion Mining from Text and Multimedia
Diana Maynard and Jonathon Hare

Context-Aware Sentiment Analysis of Social Media
Aminu Muhammad, Nirmalie Wiratunga and Robert Lothian

Case-Studies in Mining User-Generated Reviews for Recommendation
Ruihai Dong, Michael P. O’Mahony, Kevin McCarthy and Barry Smyth

Predicting Emotion Labels for Chinese Microblog Texts
Zheng Yuan and Matthew Purver

Author Index

Pattern-Based Emotion Classification on Social Media

Erik Tromp and Mykola Pechenizkiy

E. Tromp (✉)
Adversitement B.V., Uden, The Netherlands
e-mail: [email protected]

M. Pechenizkiy
Eindhoven University of Technology, Eindhoven, The Netherlands
e-mail: [email protected]
http://www.win.tue.nl/~mpechen/

E. Tromp
Department of Computer Science, TU Eindhoven, P.O. Box 513, 5600 MB Eindhoven, The Netherlands

© Springer International Publishing Switzerland 2015
M.M. Gaber et al. (eds.), Advances in Social Media Analysis, Studies in Computational Intelligence 602, DOI 10.1007/978-3-319-18458-6_1

Abstract

Sentiment analysis can go beyond the typical granularity of polarity that assumes each text to be positive, negative or neutral. Indeed, human emotions are much more diverse, and it is interesting to study how to define a more complete set of emotions and how to deduce these emotions from human-written messages. In this book chapter we argue that combining Plutchik’s wheel of emotions model with a rule-based approach for emotion detection in text yields a good framework for emotion classification on social media. We provide a detailed description of how to define rule-based patterns for detecting the emotions of Plutchik’s wheel, how to learn them from annotated social media data and how to apply them to classify emotions in previously unseen texts. The results of the experimental study suggest that the described framework is promising and that it advances the current state-of-the-art in emotion detection.

1 Introduction

Sentiment analysis can be performed at different levels of granularity: the document level [17, 26], the word level [12] or the sentence or phrase level [24]. It can also be performed with different levels of detail: determining the polarity of a message or the emotion expressed [22]. Because a single social media message typically consists of one or two sentences, we study how sentiment is expressed at the sentence level.
Current sentiment analysis methods, ranging from baseline bag-of-words methods to state-of-the-art recursive neural networks [28], typically focus on deducing information on subjectivity or polarity only (Sect. 4). Human emotions move far beyond these simple metrics and are much more diverse. This implies that such subjectivity or polarity analysis gives only limited information on the actual intent of the author of a message.

Defining axes of polarity is not a hard task: typically one has negativity, positivity and a notion of neutrality or objectivity in between. For emotions, however, defining a complete and clear set is much more difficult. Though several researchers and bodies have attempted to define standards in this field ([20, 21, 25], AAAC (footnote 1)), there is still no consensus on a basic set of emotions that is generally accepted and could be objectively verified.

The goal of this chapter is to present a sentiment analysis approach accompanied by a model of emotions that fit well together, in order to set a standard in emotion analysis that can be expanded upon.

To achieve this goal we do not seek to define or implement our own model of emotions, but choose an existing psychological model of basic emotions that is manageable yet applicable to any given domain or language. We provide motivation why the wheel of emotions defined by Plutchik [21] is suitable for our purpose. Besides the model of emotions, we aim to define an algorithm that deduces these emotions from human-written texts.

We present a new RBEM-Emo approach [31] for emotion detection from human-written texts. This algorithm is based on the work in [30], which introduced the Rule-Based Emission Model (RBEM) algorithm for polarity detection only. RBEM generates positive and negative emissions based on several groups of patterns that capture the various ways in which sentiment can be expressed in natural language.
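To make the idea of pattern-driven emissions concrete, the following is a minimal sketch only, not the RBEM algorithm of [30]. It assumes that a pattern is a contiguous token sequence carrying a signed emission value, that emissions are simply summed, and that the sign of the sum gives the polarity. The `Pattern` class, the example patterns, their weights and the function names are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Pattern:
    """Hypothetical pattern: a token sequence plus a signed emission value."""
    tokens: Tuple[str, ...]  # contiguous tokens that must match
    emission: float          # > 0 emits positivity, < 0 emits negativity


# Hand-written toy patterns (assumption: real patterns would be learned
# from annotated data and organised into several pattern groups).
PATTERNS: List[Pattern] = [
    Pattern(("good",), 1.0),
    Pattern(("love",), 2.0),
    Pattern(("bad",), -1.0),
    Pattern(("not", "good"), -2.0),  # outweighs the +1.0 emitted by "good"
]


def total_emission(text: str, patterns: List[Pattern] = PATTERNS) -> float:
    """Sum the emissions of every pattern occurrence found in the text."""
    tokens = text.lower().split()
    total = 0.0
    for pattern in patterns:
        n = len(pattern.tokens)
        for i in range(len(tokens) - n + 1):
            if tuple(tokens[i:i + n]) == pattern.tokens:
                total += pattern.emission
    return total


def polarity(text: str) -> str:
    """Map the summed emission to a three-way polarity label."""
    score = total_emission(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

In this toy version, negation is handled by giving the two-token pattern a negative emission large enough to cancel the positive single-token match, so "this is not good" comes out negative. This is a simplification made for the sketch and should not be read as how RBEM treats negation.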
1 The Association for the Advancement of Affective Computing, http://emotion-research.net/.

In [30] we extensively experimented with RBEM on English and Dutch messages extracted from Twitter. The experiments demonstrate that designing such an algorithm, instead of applying state-of-the-art general-purpose classification techniques, is a reasonable choice for automated sentiment classification in practice. Using RBEM we were able to design a competitive sentiment classification system showing promising accuracy results close to 80% on the considered datasets. We also illustrated that RBEM can be used in multilingual settings and is applicable to social media, where language constructs are not always regular. Besides, we provided some further evidence that RBEM-based systems are easy to debug, improve over time and adapt to new application domains for which no previously annotated data were available. This is rather important in practice too, as the use of language is highly dependent upon the domain in which it is being used. As such, it is expected that a generically trained model does not perform as well as it should on a specific domain and that domain-specific models do not port well to other domains. RBEM in fierce contrast to general-purpose state-of-the-art classification techniques