
Deep Learning for Natural Language Processing PDF

296 pages · 2022 · 8.491 MB

Preview Deep Learning for Natural Language Processing

Stephan Raaijmakers

Deep Learning for Natural Language Processing

MANNING, Shelter Island

For online information and ordering of this and other Manning books, please visit www.manning.com. The publisher offers discounts on this book when ordered in quantity. For more information, please contact Special Sales Department, Manning Publications Co., 20 Baldwin Road, PO Box 761, Shelter Island, NY 11964. Email: [email protected]

©2022 by Manning Publications Co. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by means electronic, mechanical, photocopying, or otherwise, without prior written permission of the publisher.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in the book, and Manning Publications was aware of a trademark claim, the designations have been printed in initial caps or all caps.

Recognizing the importance of preserving what has been written, it is Manning's policy to have the books we publish printed on acid-free paper, and we exert our best efforts to that end. Recognizing also our responsibility to conserve the resources of our planet, Manning books are printed on paper that is at least 15 percent recycled and processed without the use of elemental chlorine.

The author and publisher have made every effort to ensure that the information in this book was correct at press time. The author and publisher do not assume and hereby disclaim any liability to any party for any loss, damage, or disruption caused by errors or omissions, whether such errors or omissions result from negligence, accident, or any other cause, or from any usage of the information herein.

Development editor: Dustin Archibald
Technical development editors: Michiel Trimpe and Al Krinker
Review editor: Ivan Martinović
Production editor: Keri Hales
Copy editor: Tiffany Taylor
Proofreader: Katie Tennant
Technical proofreader: Mayur Patil
Typesetter and cover designer: Marija Tudor

ISBN 9781617295447
Printed in the United States of America

brief contents

Part 1: Introduction
1 Deep learning for NLP
2 Deep learning and language: The basics
3 Text embeddings

Part 2: Deep NLP
4 Textual similarity
5 Sequential NLP
6 Episodic memory for NLP

Part 3: Advanced topics
7 Attention
8 Multitask learning
9 Transformers
10 Applications of Transformers: Hands-on with BERT

contents

preface
acknowledgments
about this book
about the author
about the cover illustration

Part 1: Introduction

1 Deep learning for NLP
    1.1 A selection of machine learning methods for NLP
        The perceptron ■ Support vector machines ■ Memory-based learning
    1.2 Deep learning
    1.3 Vector representations of language
        Representational vectors ■ Operational vectors
    1.4 Vector sanitization
        The hashing trick ■ Vector normalization

2 Deep learning and language: The basics
    2.1 Basic architectures of deep learning
        Deep multilayer perceptrons ■ Two basic operators: Spatial and temporal
    2.2 Deep learning and NLP: A new paradigm

3 Text embeddings
    3.1 Embeddings
        Embedding by direct computation: Representational embeddings ■ Learning to embed: Procedural embeddings
    3.2 From words to vectors: Word2Vec
    3.3 From documents to vectors: Doc2Vec

Part 2: Deep NLP

4 Textual similarity
    4.1 The problem
    4.2 The data
        Authorship attribution and verification data
    4.3 Data representation
        Segmenting documents ■ Word-level information ■ Subword-level information
    4.4 Models for measuring similarity
        Authorship attribution ■ Verifying authorship

5 Sequential NLP
    5.1 Memory and language
        The problem: Question Answering
    5.2 Data and data processing
    5.3 Question Answering with sequential models
        RNNs for Question Answering ■ LSTMs for Question Answering ■ End-to-end memory networks for Question Answering

6 Episodic memory for NLP
    6.1 Memory networks for sequential NLP
    6.2 Data and data processing
        PP-attachment data ■ Dutch diminutive data ■ Spanish part-of-speech data
    6.3 Strongly supervised memory networks: Experiments and results
        PP-attachment ■ Dutch diminutives ■ Spanish part-of-speech tagging
    6.4 Semi-supervised memory networks
        Semi-supervised memory networks: Experiments and results

Part 3: Advanced topics

7 Attention
    7.1 Neural attention
    7.2 Data
    7.3 Static attention: MLP
    7.4 Temporal attention: LSTM
    7.5 Experiments
        MLP ■ LSTM

8 Multitask learning
    8.1 Introduction to multitask learning
    8.2 Multitask learning
    8.3 Multitask learning for consumer reviews: Yelp and Amazon
        Data handling ■ Hard parameter sharing ■ Soft parameter sharing ■ Mixed parameter sharing
    8.4 Multitask learning for Reuters topic classification
        Data handling ■ Hard parameter sharing ■ Soft parameter sharing ■ Mixed parameter sharing
    8.5 Multitask learning for part-of-speech tagging and named-entity recognition
        Data handling ■ Hard parameter sharing ■ Soft parameter sharing ■ Mixed parameter sharing

9 Transformers
    9.1 BERT up close: Transformers
    9.2 Transformer encoders
        Positional encoding
    9.3 Transformer decoders
    9.4 BERT: Masked language modeling
        Training BERT ■ Fine-tuning BERT ■ Beyond BERT

10 Applications of Transformers: Hands-on with BERT
    10.1 Introduction: Working with BERT in practice
    10.2 A BERT layer
    10.3 Training BERT on your data
    10.4 Fine-tuning BERT
    10.5 Inspecting BERT
        Homonyms in BERT
    10.6 Applying BERT

bibliography
index
