MULTIMODAL SURVEILLANCE
Behavior analysis for recognizing stress and aggression

DISSERTATION

for the degree of doctor at the Technische Universiteit Delft, by authority of the Rector Magnificus prof. ir. K.C.A.M. Luyben, chairman of the Board for Doctorates, to be defended in public on Wednesday 12 March 2014 at 10:00

by

Iulia LEFTER
Master of Science in Media and Knowledge Engineering, Delft University of Technology
born in Brașov, Romania.

This dissertation has been approved by the promotor: Prof. dr. C.M. Jonker
Copromotor: Prof. drs. dr. L.J.M. Rothkrantz

Composition of the doctoral committee:
Rector Magnificus, chairman
Prof. dr. C.M. Jonker, Technische Universiteit Delft, promotor
Prof. drs. dr. L.J.M. Rothkrantz, Technische Universiteit Delft, copromotor
Prof. dr. Ing. E. Nöth, University of Erlangen-Nürnberg
Prof. dr. A. Hanjalic, Technische Universiteit Delft
Prof. dr. H. de Ridder, Technische Universiteit Delft
Prof. dr. F.W. Jansen, Technische Universiteit Delft
Dr. ir. G.J. Burghouts, TNO

This research project was supported by the Faculty of Military Sciences of the Netherlands Defence Academy, the Netherlands Organisation for Applied Scientific Research and Delft University of Technology.

ISBN 978-94-6186-288-4
Copyright 2014, Iulia Lefter. All rights reserved.
Cover design by I. & M. Lefter.

Acknowledgements

In 2006 I came to TUDelft for the first time, as an exchange student during my BSc. It was then that Leon Rothkrantz introduced me to the fascinating field of emotion recognition. It is now eight years later, I am finishing my PhD and I still find that topic as fascinating as ever. And looking back on my years as a PhD student, I realize that doing a PhD really made me experience a full range of emotions.
It is also at the end of the journey that I realize most how much others affected this period, and that it was the full support I received at critical times that kept me going.

I would like to start by expressing my gratitude to Leon Rothkrantz, my supervisor from TUDelft and NLDA. I would like to thank him for his support over the years and for being a constant source of inspiration. I always felt that because of his dedication, he never had any constraints in giving his time. It also seemed that there is no problem that doesn’t have a solution, and it was sometimes hard to keep up with all his ideas. I thank him for the insightful discussions we had while travelling to Den Helder. I am also very happy to have met his wife Fien, a constant source of joy and optimism.

I very much enjoyed working together with Gertjan Burghouts, my supervisor from TNO. I have learned a lot from him. I appreciate the way he has guided me, caring for my development as a scientist. He reminded me every once in a while to step back and remember the big picture, taught me not to lose focus and to aim high. I very much enjoyed his enthusiasm and optimism, and talking to him about work and everything else.

I thank Catholijn Jonker for being my promotor. The discussions we had are very special to me. I appreciate very much her support, her involvement and the way she always challenged me. I always found her ambition and joyful way of being very inspiring.

This thesis would have literally not been possible without the help of many people. I very much appreciate the involvement of Pascal Wiggers in my work. Pascal is patient and helpful and I always felt that he has a unique ability to explain complex matters very clearly. I value very much the expertise that David van Leeuwen brought to parts of my work.
Special thanks to Mirela Popa, for her enormous help throughout our times as PhD students, for helping me with the annotation and for all the nice moments we shared during our trips to conferences and summer schools.

My work is centered very much on recordings of a group of stand-up comedians. I have often watched them over and over, trying to get inspiration, but sometimes I just found humour instead. I appreciate their participation very much.

I would like to take this opportunity to thank all the participants of our experiments and annotation sessions. I also thank the MSc and BSc students I had the pleasure to supervise over time; I learned a lot from that experience.

Special thanks to Zhenke Yang for proofreading my thesis and for offering good suggestions for improving readability. My thanks also go to Maaike Belien for reviewing parts of this thesis. Furthermore, I thank all the committee members and reviewers for their valuable comments on how to improve my work.

I thank Harold Nefs for his help with some of the experiments and for giving me the opportunity to continue as a researcher in the II group. I thank Siska for inspiring me ever since we met in 2006 and for all the useful pieces of advice. And of course, without Anita, Bart and Ruud, nothing would really work.

The defence day is of course very special. I would like to thank Adriana and Reyhan for being my paranymphs, for all their support and encouragement, and of course for their friendship.

As my PhD project involved a collaboration between three partners, the work was also carried out at three locations. I will always remember with pleasure the days spent at NLDA and I thank Frans Absil and the members of the SEWACO group for that. Going to NLDA was not easy, I admit it, first of all because I had to get up very early on Mondays to get there from Delft. But together with the other PhD students and postdocs, we continued the practice on Tuesdays by waking up at 6 a.m. to go to the swimming pool.
I thank Dragos, Zhenke, Marina and Madalina for that and for all the other fun activities we did together, such as fencing, squash or just walking by the shore. Thanks to Ramon for being the only one who agreed to be my climbing partner. I am grateful to NLDA for providing accommodation and access to all the facilities.

The days spent at TNO were also very special. I always felt welcome and I would like to thank Ronald Kersten and all the colleagues from the Intelligent Imaging group for that. I thank Richard, Johan-Martijn, Coen and Willem, whom I bugged every once in a while with some technical questions. I enjoyed sharing the room with Adam and later with Giljam, who were very patient Dutch conversation partners. The TNO day out on a sailing boat with Ernst and Richard during extreme wind and rain is something I will probably not forget. I always found the idea of taking a walk around the dunes at lunch time very nice! Later on, I had to trade that for another excellent experience, the ‘Dutch lunches’, for which I especially thank Lejla and Olga.

I thank my old roommates from TUDelft: Mirela, Dragos and Zhenke, and temporary roommates Marina, Madalina and Bogdan; we really had a good atmosphere. After some time the Romanian vs. Asian balance in the room I was in changed dramatically, and I enjoy the company of Tingting, Changyun and Ernestasia. I want to thank them for always being so positive, and I find our talks and intercultural exchange very enriching. Special thanks to Tingting for all the Chinese food tips and for providing me with special ingredients. During the last part of my PhD I began to spend more time at TUDelft. I very much enjoyed being part of the Interactive Intelligence group and I would like to thank Alex, Arman, Chris, Christina, Maaike, Reyhan, Thomas, Tim, Vanessa, Yangyang and everyone else in the group for that. I also thank George, Marius and Mihai, my late lunch companions.
Doing a PhD can sometimes be quite hard, but luckily it always got better ‘with a little help from my friends’. I would like to thank my friends for all the good times we had together. To mention just a few: Adriana, Alex, Alin, Andra, Andreea, Andrei, Bogdan N., Bogdan S., Bogdan T., Catalin, Dana, Dragos, George, Hani, Madalina, Maria, Marina, Marius, Mihai C., Nike, Peter, and the list could go on but I am sure you all know who you are and that you would forgive me for not making the list longer. I enjoyed all the trips, sports, gaming nights and parties. I also thank my old and dear friends from Romania. It is amazing how time and space don’t have any impact on how we feel.

I thank my parents for always being there for me. Despite the physical distance I always felt we were as close as ever.

Finally, I thank Mihai for believing in me, standing by me at all times and for loving me.

Iulia Lefter, Delft, February 2014.

Contents

1 Introduction
  1.1 Overview of current surveillance systems
  1.2 Problem definition
  1.3 Approach
    1.3.1 Databases
    1.3.2 Annotation
    1.3.3 Computational modelling
  1.4 Outline
  1.5 List of publications
    1.5.1 Journal publications
    1.5.2 Peer-reviewed conference papers
  References

2 Automatic Stress Detection in Emergency (Telephone) Calls
  2.1 Introduction
  2.2 Related work
    2.2.1 Databases of emotional speech
    2.2.2 Emotional features in speech
    2.2.3 Classification techniques
    2.2.4 State of the art with military applications
  2.3 Methodology
    2.3.1 The South-African database
    2.3.2 Feature set
    2.3.3 Classification techniques
  2.4 Results and interpretation
  2.5 Applications
  2.6 Conclusions
  References

3 An audio-visual dataset of human-human interactions in stressful situations
  3.1 Introduction
  3.2 Related work
    3.2.1 A review of available datasets
    3.2.2 The relation between speech and gestures
    3.2.3 Gesture typologies
    3.2.4 Terminology
  3.3 Audio-visual recordings of human-human interaction during stressful conditions at a service desk
    3.3.1 Data acquisition and content description
    3.3.2 Segmentation
    3.3.3 Annotations of the stress level, speech and gesture’s communicative functions and congruence
    3.3.4 Annotations of emotion-related properties for gesture classes
  3.4 Analyses of dataset contents and annotations
    3.4.1 Stress and the congruence between speech and gestures
    3.4.2 Analysis of gesture types
  3.5 Conclusion
  References

4 On stress and its automatic recognition using speech and gestures
  4.1 Introduction
  4.2 Related work
  4.3 Methodology
    4.3.1 Model of stress expression and perception using speech and gestures from a human perspective
    4.3.2 Model for automatic stress assessment using intermediate level variables
  4.4 Dataset of human-human interaction at a service desk
    4.4.1 Content description
    4.4.2 Annotations