MASTER’S THESIS | LUND UNIVERSITY 2017

News text generation with adversarial deep learning

Filip Månsson, Fredrik Månsson

Department of Computer Science
Faculty of Engineering LTH
ISSN 1650-2884
LU-CS-EX 2017-18

September 6, 2017

Master’s thesis work carried out at Sony Mobile Communications AB.

Supervisors: Håkan Jonsson, Pierre Nugues
Examiner: Jacek Malec

Abstract

In this work we carry out a thorough analysis of applying a specific field within machine learning, generative adversarial networks, to natural language generation; more specifically, we generate news text articles in an automated fashion. To do this, we experimented with a few different architectures and representations of text, evaluated the results, and used the information retrieved from the results to create a model that should give the best result. For evaluation, we used perplexity and human evaluation. We also looked at the token distribution to see which model captures the texts most successfully.

We show that it is possible to use generative adversarial networks to generate sequences of tokens that resemble natural language, but the output does not yet reach the quality of human-written text. Further hyperparameter tuning and a corpus restricted to a narrower subject could improve the output.

Keywords: Machine learning, generative adversarial learning, GAN, natural language generation

Acknowledgements

We would like to thank both of our supervisors for helping us with this project and for taking the time and effort to answer our questions and provide us with valuable feedback. We also want to send our regards to our parents and siblings for their support throughout our lives.

Contents

1 Introduction
  1.1 Overview
  1.2 Problem Definition
  1.3 Related Work
  1.4 Contributions
2 Background
  2.1 Text Generation
  2.2 Neural Networks
    2.2.1 Convolutional neural networks
    2.2.2 Recurrent neural networks
    2.2.3 Long short term memory
    2.2.4 Residual learning
  2.3 Generative Adversarial Networks
    2.3.1 Generator
    2.3.2 Discriminator
    2.3.3 Cost function
    2.3.4 Algorithm
    2.3.5 Known issues
  2.4 Wasserstein-GAN
    2.4.1 Generator
    2.4.2 Critic
    2.4.3 Cost function
    2.4.4 Algorithm
    2.4.5 Known issues
  2.5 Improved Wasserstein-GAN
    2.5.1 Cost function
    2.5.2 Algorithm
    2.5.3 Known issues
  2.6 Text Representation
3 Approach
  3.1 Overall Approach
  3.2 Setup
    3.2.1 Corpus
    3.2.2 Training
  3.3 Models
    3.3.1 Baseline
    3.3.2 GAN model
    3.3.3 WGAN models
    3.3.4 Motivation of approach
4 Evaluation
  4.1 Metrics Used
    4.1.1 Perplexity
    4.1.2 Human evaluation
  4.2 Results
    4.2.1 Results using characters
    4.2.2 Results using words
    4.2.3 Generated text
  4.3 Final Model
  4.4 Human Evaluation
  4.5 Discussion
5 Conclusions
  5.1 Conclusions
  5.2 Future Work
Appendix A Generated articles
Appendix B Questionnaire