
News text generation with adversarial deep learning PDF

80 Pages·2017·2.65 MB·English
by Filip Månsson and Fredrik Månsson

Preview News text generation with adversarial deep learning

MASTER'S THESIS | LUND UNIVERSITY 2017
News text generation with adversarial deep learning
Filip Månsson, Fredrik Månsson
Department of Computer Science, Faculty of Engineering LTH
ISSN 1650-2884, LU-CS-EX 2017-18

Filip Månsson ([email protected]), Fredrik Månsson ([email protected])
September 6, 2017
Master's thesis work carried out at Sony Mobile Communications AB.
Supervisors: Håkan Jonsson ([email protected]), Pierre Nugues ([email protected])
Examiner: Jacek Malec ([email protected])

Abstract

In this work we carry out a thorough analysis of applying a specific field within machine learning, generative adversarial networks, to the art of natural language generation; more specifically, we generate news text articles in an automated fashion. To do this, we experimented with a few different architectures and representations of text, evaluated the results, and used the information retrieved from the results to create a model that should give the best result. For evaluation, we used perplexity and human evaluation. We also looked at the token distribution to see which model captures the texts most successfully. We show that it is possible to use generative adversarial networks to generate sequences of tokens that resemble natural language, but this does not yet reach the quality of human-written text. Further hyperparameter tuning and using a corpus with a narrower subject focus could improve the output.

Keywords: Machine learning, generative adversarial learning, GAN, natural language generation

Acknowledgements

We would like to thank both of our supervisors for helping us with this project and taking the time and effort to answer our questions, as well as providing us with valuable feedback. We also want to send our regards to our parents and siblings for their support throughout our lives.

Contents

1 Introduction 7
  1.1 Overview 7
  1.2 Problem Definition 8
  1.3 Related Work 8
  1.4 Contributions 9
2 Background 11
  2.1 Text Generation 11
  2.2 Neural Networks 12
    2.2.1 Convolutional neural networks 12
    2.2.2 Recurrent neural networks 12
    2.2.3 Long short-term memory 13
    2.2.4 Residual learning 13
  2.3 Generative Adversarial Networks 13
    2.3.1 Generator 15
    2.3.2 Discriminator 15
    2.3.3 Cost function 15
    2.3.4 Algorithm 15
    2.3.5 Known issues 16
  2.4 Wasserstein-GAN 16
    2.4.1 Generator 17
    2.4.2 Critic 17
    2.4.3 Cost function 17
    2.4.4 Algorithm 18
    2.4.5 Known issues 18
  2.5 Improved Wasserstein-GAN 19
    2.5.1 Cost function 19
    2.5.2 Algorithm 20
    2.5.3 Known issues 20
  2.6 Text Representation 20
3 Approach 23
  3.1 Overall Approach 23
  3.2 Setup 23
    3.2.1 Corpus 23
    3.2.2 Training 24
  3.3 Models 25
    3.3.1 Baseline 25
    3.3.2 GAN model 25
    3.3.3 WGAN models 25
    3.3.4 Motivation of approach 29
4 Evaluation 31
  4.1 Metrics Used 31
    4.1.1 Perplexity 31
    4.1.2 Human evaluation 32
  4.2 Results 32
    4.2.1 Results using characters 33
    4.2.2 Results using words 33
    4.2.3 Generated text 43
  4.3 Final Model 47
  4.4 Human Evaluation 50
  4.5 Discussion 51
5 Conclusions 55
  5.1 Conclusions 55
  5.2 Future Work 56
Appendix A Generated articles 63
Appendix B Questionnaire 65
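
The improved Wasserstein-GAN of Section 2.5 replaces the original WGAN's weight clipping with a gradient penalty on the critic's cost function. As a rough illustration of that cost, here is a minimal sketch in modern TensorFlow/Keras (the thesis used Keras with a TensorFlow backend); the names, the [batch, sequence, vocabulary] tensor shapes, and the helper functions are illustrative assumptions, not the authors' code:

```python
# Minimal sketch of the improved Wasserstein-GAN (WGAN-GP) critic cost.
# Assumes `real` and the generator's output are shaped
# [batch, seq_len, vocab_size] (e.g. softmax distributions over tokens).
import tensorflow as tf

LAMBDA = 10.0  # gradient-penalty weight, as in Gulrajani et al. (2017)

def critic_loss(critic, generator, real, noise):
    fake = generator(noise)
    # Wasserstein estimate: the critic should score real high, fake low,
    # so we minimise mean(critic(fake)) - mean(critic(real)).
    w_loss = tf.reduce_mean(critic(fake)) - tf.reduce_mean(critic(real))
    # Gradient penalty on random interpolates between real and fake samples
    eps = tf.random.uniform([tf.shape(real)[0], 1, 1], 0.0, 1.0)
    interp = eps * real + (1.0 - eps) * fake
    with tf.GradientTape() as tape:
        tape.watch(interp)
        score = critic(interp)
    grads = tape.gradient(score, interp)
    norm = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2]) + 1e-12)
    gp = tf.reduce_mean(tf.square(norm - 1.0))  # drive gradient norm to 1
    return w_loss + LAMBDA * gp

# The generator update would then minimise -mean(critic(generator(noise))).
```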

Description:
Excerpts: "Normally RNNs are capable of using context information, i.e. predicting the next word given the …" · "The machine learning library of our choice was Keras with Tensorflow as backend for some …" · "… occurs in the middle of a sentence or not."
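
For context on the next-word prediction the first excerpt mentions: a baseline RNN language model of this kind can be sketched in a few lines of Keras with TensorFlow as backend, the stack named in the thesis. The vocabulary size, layer widths, and the `x_val`/`y_val` names below are illustrative assumptions, not the authors' configuration:

```python
# Illustrative sketch (not the authors' code) of a baseline next-word
# language model: an LSTM reads the preceding tokens and a softmax
# layer predicts a distribution over the next word.
import tensorflow as tf
from tensorflow.keras import layers, models

VOCAB_SIZE = 20000  # assumed vocabulary size, for illustration only

model = models.Sequential([
    layers.Embedding(VOCAB_SIZE, 128),               # token ids -> vectors
    layers.LSTM(256),                                # summarise left context
    layers.Dense(VOCAB_SIZE, activation="softmax"),  # next-word distribution
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")

# Perplexity (Section 4.1.1) is exp(mean cross-entropy) on held-out text:
# perplexity = float(tf.exp(model.evaluate(x_val, y_val)))
```

Perplexity, one of the two evaluation metrics named in the abstract, is then just the exponential of such a model's mean cross-entropy on held-out text, which is why the cross-entropy loss above is the natural training objective.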
