INTELLIGENT EMAIL: AIDING USERS WITH AI

Mark Harel Dredze

A DISSERTATION in Computer and Information Science

Presented to the Faculties of the University of Pennsylvania in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

2009

Fernando Pereira
Supervisor of Dissertation

Rajeev Alur
Graduate Group Chairperson

COPYRIGHT
Mark Harel Dredze
2009

This dissertation is dedicated to the memory of my mother Helen Marlene Berkovitz Dredze, Ph.D., who was the first Ph.D. I knew.

Acknowledgements

This thesis is the product of both hard work and invaluable assistance from colleagues and friends. In my time at Penn, I have had the privilege of working with some of the finest faculty and students.

First and foremost, I thank my adviser Professor Fernando Pereira, who has expertly guided my research and career. Much can be said about his amazing breadth and depth of knowledge in so many fields. Whenever I mention his name to colleagues in the field, I always hear praise of his kindness and character. This has made him a wonderful adviser and mentor.

Next, I thank my committee, who have shaped both my thesis and my career in the field. Professor Mitch Marcus and I have shared many long and fascinating conversations, many that had nothing to do with computer science. I will miss our frequent discussions and look forward to having more in the future. Professor Ani Nenkova read my thesis with a careful eye and provided invaluable critical feedback. Our conversations gave me confidence and direction in my research. Professor Ben Taskar constantly amazes me with his depth of knowledge and rapid understanding of difficult material. I often learned something new about my own work in our discussions. I am very pleased that Professor Lise Getoor was able to join my committee and am impressed that our relatively few interactions had such a strong impact on this thesis. I look forward to continuing these conversations and to the new work they will produce.

There are many others at Penn who contributed to my success.
I have been deeply privileged to have worked with Dr. Koby Crammer. Koby has been a second adviser to me and has taught me so much about how to conduct high quality research. One of my biggest regrets in leaving Penn is that Koby has so much more to teach and I have so much more to learn. I know that his graduate students will be truly lucky to have such a wonderful adviser.

I have worked closely with many of my colleagues in Fernando’s research group. John Blitzer was my model for a hard working and successful graduate student. Our close collaboration helped me grow into a mature graduate student. I already miss his two-view hoodie since he graduated last year. Thanks to Ryan McDonald for being constantly willing to talk about and help with interesting problems; Partha Pratim Talukdar for many exciting collaborations, even the ones that didn’t work; Alex Kulesza for many fascinating conversations that were a welcome distraction; Kuzman Ganchev for his careful explanations and witty humor; Qian Liu and Axel Bernal, who both showed patience as they explained basic biology concepts each time we spoke so that I could better understand their work. The NLP and machine learning students at Penn make it such a great graduate experience: Ryan Gabbard, Emily Pitler, Qiuye (Sophie) Zhao, Annie Louis, Nikhil Dinesh, Liang Huang, Jenn Wortman, Ted Sandler, Kilian Weinberger, Bill Kandylas, Jeff Vaughn (honorary machine learning/NLP). I also enjoyed working with several excellent visiting students: Kedar Bellare, who really taught me CRFs, Joao Graca, who introduced me to the other Fernando Pereira, and Hanna Wallach, who spent hours talking with me about machine learning, user interfaces and women in CS. Thank you to Joel Wallenberg, a linguist who constantly reassured me that I did know something about linguistics and rarely laughed when I didn’t. To all of those here and to those I haven’t mentioned: you made my time at Penn a wonderful experience.

In addition to these colleagues, I thank the wonderful CIS staff who were a constant support to both myself and other students. They ensured that things always ran smoothly and were there for the times when things did not. I thank Mike Felker for helping me from my first visit to Penn until I handed in this thesis.

Thank you to the talented and numerous undergraduate students who I have worked with on a wide variety of fascinating projects. My work with undergraduate research has been a rewarding experience throughout my graduate career.

In addition to my colleagues at Penn, I have been privileged to work with many others in the field. Tessa Lau advised me during my internship at IBM before I began my graduate career and has continued to mentor me throughout my graduate career. She has been my constant mentor and I look forward to continuing our rewarding conversations. Rie Ando and Tong Zhang both taught me much during my time at IBM; I am privileged to know such talented and respected researchers. I am privileged to have spent two summers working with world-class researchers at Google. Krzysztof Czuba, Peter Norvig and Bill Schilit are among some of the finest researchers and peers in the field. I will always recall fondly my time with these and other colleagues at Google, IBM, and Microsoft.

I would never have even thought of pursuing a PhD had it not been for my close relationships with Professors Larry Birnbaum and Kris Hammond, who advised me during my undergraduate career at Northwestern University. I have many fond and hilarious stories of my time with Larry and Kris.

While these colleagues have ensured my academic success, so many friends outside of school have made my years at Penn some of the best of my life. It would be impossible to name all of my friends these past few years. The 4105 community has been my home. As I join its members who have left and dispersed across the globe, I know that these friends will remain with me throughout my life.

Finally, I would like to thank my family for constantly supporting and caring for me. Their encouragement and love has made me who I am today.

Acharon acharon chaviv, I thank my dear wife Chava Evans, who has been a constant source of support and encouragement. Without her love I would never have been able to achieve all that I have.

ABSTRACT

INTELLIGENT EMAIL: AIDING USERS WITH AI

Mark Harel Dredze

Fernando Pereira

User productivity and attention suffer from email overload. The human computer interaction community has designed new types of interfaces to facilitate email management, including email triage, activity management, search and organization. In this work we draw ideas from machine learning and natural language processing to introduce intelligent email and define it as intelligent systems for supporting email interfaces. Our interfaces are information driven, enabling users to make faster, smarter and less error prone decisions in processing email.

We develop intelligent email in several stages. First, we examine the common problem of the forgotten attachment by building an attachment prediction system, which can support different user interfaces for this problem. Next, we explore the task of email triage, the process of managing large amounts of email. We propose a reply management system supported by a reply predictor, automatically labeling messages that need a reply. To enable cross-user learning, we develop a shared set of deictic features with user specific extraction based on social network analysis of email. We then explore new representations for message content based on latent concept models. Next, we develop a system for email activity classification to support email activity management interfaces. Finally, we extend the popular tool of faceted browsing to email by developing automatic facet rankers to select the most useful facets for display to the user. A large scale evaluation and user survey demonstrate the effectiveness of intelligent email applications in real world settings.

We also consider new learning methods useful for intelligent email: Confidence-Weighted (CW) learning.
CW learning is a family of online learning algorithms where online updates are confidence sensitive, favoring larger updates to rarer features. Incorporating language sensitivities improves performance on a number of NLP applications, including important learning settings for intelligent email. We consider how to scale learning in the email setting to very large data environments through parallel training. To reduce the cost of labeling email by users, we consider active learning. We show that CW learning improves standard margin-based active learning. Finally, we show how confidence sensitive parameter combinations can be used to perform cross-user and multi-domain learning.

Contents

Acknowledgements ... iv

1 Introduction ... 1
  1.1 Human-Computer Interaction ... 2
  1.2 Artificial Intelligence and Email ... 3
    1.2.1 Intelligent Agents ... 5
  1.3 Thesis Goals ... 6
  1.4 Intelligent Email ... 7
    1.4.1 User Oriented Learning ... 8
    1.4.2 User Supporting Interfaces ... 9
  1.5 Confidence-Weighted Learning ... 9
  1.6 Thesis Overview ... 11

2 Attachment Prediction ... 14
  2.1 Introduction ... 14
  2.2 Learning System ... 15
  2.3 Features ... 16
  2.4 Corpus ... 17
  2.5 Evaluation ... 18
  2.6 Conclusion ... 20

3 Reply Prediction ... 21
  3.1 Reply Prediction ... 21
  3.2 User Interface ... 23
  3.3 Prediction System ... 24
  3.4 Challenges: Cross-User Performance ... 25
  3.5 Features for Reply Prediction ... 26
    3.5.1 Deictic Features ... 28
    3.5.2 Content Features ... 35
    3.5.3 Feature Conjunctions ... 37
  3.6 Evaluation ... 38
    3.6.1 Corpus ... 38
    3.6.2 Baseline ... 39
    3.6.3 Results ... 39
  3.7 Discussion ... 41
  3.8 Conclusion ... 42

4 Analysis of Representations for Learning in Email ... 44
  4.1 Introduction ... 44
  4.2 Choosing Good Summary Keywords ... 47
  4.3 Latent Concept Models ... 48
    4.3.1 Latent Semantic Analysis ... 48
    4.3.2 Latent Dirichlet Allocation ... 49
  4.4 Generating Summary Keywords ... 50
    4.4.1 Query-Document Similarity ... 50
    4.4.2 Word Association ... 51
  4.5 Evaluation ... 53
    4.5.1 Automated Foldering ... 54
    4.5.2 Recipient Prediction ... 58
  4.6 Discussion ... 60

