MASTER’S THESIS | LUND UNIVERSITY 2017
News text generation with adversarial deep learning
Filip Månsson, Fredrik Månsson
Department of Computer Science
Faculty of Engineering LTH
ISSN 1650-2884
LU-CS-EX 2017-18
News text generation with adversarial deep learning
Filip Månsson, tfy12fm1@gmail.com
Fredrik Månsson, tfy12fma@gmail.com
September 6, 2017
Master’s thesis work carried out at Sony Mobile Communications AB.
Supervisors: Håkan Jonsson, hakan1.jonsson@sonymobile.com
Pierre Nugues, pierre.nugues@cs.lth.se
Examiner: Jacek Malec, jacek.malec@cs.lth.se
Abstract
In this work, we carry out a thorough analysis of applying a specific field within machine learning, called generative adversarial networks, to the art of natural language generation; more specifically, we generate news text articles in an automated fashion. To do this, we experimented with a few different architectures and representations of text, evaluated the results, and used the information retrieved from the results to create a model that should give the best result. For evaluation, we used perplexity and human evaluation. We also looked at the token distribution to see which model captures the texts most successfully.
We show that it is possible to use generative adversarial networks to generate sequences of tokens that resemble natural language, but the output does not yet reach the quality of human-written text. Further hyperparameter tuning and a corpus restricted to a narrower subject could improve the output.
Keywords: Machine learning, generative adversarial learning, GAN, natural language generation
Acknowledgements
We would like to thank both of our supervisors for helping us with this project and taking the time and effort to answer our questions, as well as providing us with valuable feedback. We also want to send our regards to our parents and siblings for their support throughout our lives.
Contents
1 Introduction 7
1.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2 Problem Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.4 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2 Background 11
2.1 Text Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2 Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2.1 Convolutional neural networks . . . . . . . . . . . . . . . . . . . 12
2.2.2 Recurrent neural networks . . . . . . . . . . . . . . . . . . . . . 12
2.2.3 Long short-term memory . . . . . . . . . . . . . . . . . . . . . . 13
2.2.4 Residual learning . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.3 Generative Adversarial Networks . . . . . . . . . . . . . . . . . . . . . . 13
2.3.1 Generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3.2 Discriminator . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3.3 Cost function . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3.4 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3.5 Known issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.4 Wasserstein-GAN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.4.1 Generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4.2 Critic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4.3 Cost function . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4.4 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.4.5 Known issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.5 Improved Wasserstein-GAN . . . . . . . . . . . . . . . . . . . . . . . . 19
2.5.1 Cost function . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.5.2 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.5.3 Known issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.6 Text Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3 Approach 23
3.1 Overall Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.2 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.2.1 Corpus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.2.2 Training . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.3 Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.3.1 Baseline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.3.2 GAN model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.3.3 WGAN models . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.3.4 Motivation of approach . . . . . . . . . . . . . . . . . . . . . . . 29
4 Evaluation 31
4.1 Metrics Used . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.1.1 Perplexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.1.2 Human evaluation . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.2.1 Results using characters . . . . . . . . . . . . . . . . . . . . . . 33
4.2.2 Results using words . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.2.3 Generated text . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.3 Final Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.4 Human Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.5 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
5 Conclusions 55
5.1 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Appendix A Generated articles 63
Appendix B Questionnaire 65