Collaborative Annotation for Reliable Natural Language Processing: Technical and Sociological Aspects (PDF)

196 Pages·2016·3.71 MB·English
by Karën Fort

Preview:

FOCUS COGNITIVE SCIENCE SERIES

Collaborative Annotation for Reliable Natural Language Processing
Technical and Sociological Aspects

Karën Fort

FOCUS SERIES
Series Editor Patrick Paroubek

First published 2016 in Great Britain and the United States by ISTE Ltd and John Wiley & Sons, Inc.

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms and licenses issued by the CLA. Enquiries concerning reproduction outside these terms should be sent to the publishers at the undermentioned address:

ISTE Ltd
27-37 St George’s Road
London SW19 4EU
UK
www.iste.co.uk

John Wiley & Sons, Inc.
111 River Street
Hoboken, NJ 07030
USA
www.wiley.com

© ISTE Ltd 2016

The rights of Karën Fort to be identified as the author of this work have been asserted by her in accordance with the Copyright, Designs and Patents Act 1988.

Library of Congress Control Number: 2016936602
British Library Cataloguing-in-Publication Data
A CIP record for this book is available from the British Library
ISSN 2051-2481 (Print)
ISSN 2051-249X (Online)
ISBN 978-1-84821-904-5

Contents

Preface
List of Acronyms
Introduction

Chapter 1. Annotating Collaboratively
  1.1. The annotation process (re)visited
    1.1.1. Building consensus
    1.1.2. Existing methodologies
    1.1.3. Preparatory work
    1.1.4. Pre-campaign
    1.1.5. Annotation
    1.1.6. Finalization
  1.2. Annotation complexity
    1.2.1. Example overview
    1.2.2. What to annotate?
    1.2.3. How to annotate?
    1.2.4. The weight of the context
    1.2.5. Visualization
    1.2.6. Elementary annotation tasks
  1.3. Annotation tools
    1.3.1. To be or not to be an annotation tool
    1.3.2. Much more than prototypes
    1.3.3. Addressing the new annotation challenges
    1.3.4. The impossible dream tool
  1.4. Evaluating the annotation quality
    1.4.1. What is annotation quality?
    1.4.2. Understanding the basics
    1.4.3. Beyond kappas
    1.4.4. Giving meaning to the metrics
  1.5. Conclusion

Chapter 2. Crowdsourcing Annotation
  2.1. What is crowdsourcing and why should we be interested in it?
    2.1.1. A moving target
    2.1.2. A massive success
  2.2. Deconstructing the myths
    2.2.1. Crowdsourcing is a recent phenomenon
    2.2.2. Crowdsourcing involves a crowd (of non-experts)
    2.2.3. “Crowdsourcing involves (a crowd of) non-experts”
  2.3. Playing with a purpose
    2.3.1. Using the players’ innate capabilities and world knowledge
    2.3.2. Using the players’ school knowledge
    2.3.3. Using the players’ learning capacities
  2.4. Acknowledging crowdsourcing specifics
    2.4.1. Motivating the participants
    2.4.2. Producing quality data
  2.5. Ethical issues
    2.5.1. Game ethics
    2.5.2. What’s wrong with Amazon Mechanical Turk?
    2.5.3. A charter to rule them all

Conclusion
Appendix
Glossary
Bibliography
Index

Description:
This book presents a unique opportunity for constructing a consistent image of collaborative manual annotation for Natural Language Processing (NLP). NLP has witnessed two major evolutions in the past 25 years: firstly, the extraordinary success of machine learning, which is now, for better or for