Specializing Generative Adversarial Networks To Render Terrain Textures

Hein H. Aung
March 20, 2018

Abstract

I studied neural networks for texture generation. In particular, I researched and created Generative Adversarial Networks (GANs) for rendering new images of different types of ground terrain textures (such as soil, grass plains, drought plains, etc.). A GAN consists of two neural networks: a discriminator and a generator. The generator creates a batch of images with random pixels. The discriminator, on the other hand, tries to distinguish between real and fake images by taking its inputs from a set of real images and a set of fake images from the generator. There has been ample research done with GANs that produce synthetic 2D images or 3D graphical models. The GAN I have created, given a set of input images, will dynamically render a set of different variations of the original images of terrain textures. For example, if the training input is a set of green grass, the output will be a set of withered grass plain, fresh grass plain, and so on. I sent out a survey to the campus to evaluate these images, and my final results indicate that people can identify the type of the generated textures.

Contents

1 Introduction
2 Related Work
3 Background
   3.1 Image Classification
   3.2 Convolutional Neural Network (CNN)
   3.3 Generative Adversarial Network
4 Approach
5 Tools
6 Method for Evaluation
7 Architecture and Design
   7.1 Discriminator
   7.2 Generator
8 Data
9 Training
10 Results and Analysis of Evaluations
11 Conclusion
   11.1 Acknowledgement
Appendices
A GAN code

List of Figures

1 No Man's Sky [3]. The game uses procedural generation of terrains and textures of non-player characters (NPCs) and allows the player to explore different planets, solar systems, and star systems.
2 In the image above, an image classification model takes a single image and assigns probabilities to 4 labels: cat, dog, hat, mug. [4]
3 A cartoon of a neuron [17]
4 Multi-layered convolution network. Left: a regular 3-layer neural network. Right: how a neuron looks mathematically [13].
5 Generative Adversarial Network
6 Google Forms for evaluation
7 Dataset: cracked ground, pebbles, pavement, and grass ground.
8 Generated images: cracked ground (top left), pebbles (top right), grass ground (bottom left), and pavements (bottom right)
9 Loss values of generator and discriminator while training. We are only considering the absolute value of each loss value; the negative value is used just as a visual aid. We want the two lines to get closer to each other.
10 Loss values of discriminator and generator on the category of pebbles. As both networks are trained more and more, the loss values for the real images and the fake images become closer together. The resulting image on the left looks more like pebbles than the image on the right, which is brighter.
11 Generated grass
12 Generated pavement
13 Generated pebbles
14 Generated pebbles

List of Tables

1 Analyzed response table of grass ground (Figure 11)
2 Analyzed response table of pebbles (Figure 13)
3 Analyzed response table of cracked ground (Figure 14)
4 Analyzed response table of pavement (Figure 12)
5 Scores of each image

1 Introduction

The concept of the artificial neural network comes from biology, where neural networks play a key role in human cognition. In the human body, work is done with the help of a neural network: a web of millions of interconnected neurons. Using these concepts, we have tried to create machines capable of not just logic and reasoning but also creativity and art. So far, the most striking successes in neural networks have involved discriminative networks [2], which map a high-dimensional input to a class label [10]. These successes have primarily been based on two techniques: the backpropagation algorithm, a method of training a neural network in which the initial system output is compared to the desired output and the system is adjusted until the difference between the two is minimized, and dropout, a very efficient way of performing model averaging with neural networks [11, 7]. Deep generative models have had less of an impact, due to the difficulty of approximating many intractable probabilistic computations that arise in random estimation and related strategies.
[9] These difficulties have been sidestepped by a new generative model estimation procedure known as the Generative Adversarial Network (GAN), developed by Ian J. Goodfellow.

A generative adversarial network consists of two neural networks: a generator and a discriminator. The generator stochastically generates an image, and after a certain amount of training, the generated images will ideally be indistinguishable from the images in the training dataset. Since it is generally infeasible to engineer a function that tells whether that is the case, a discriminator network is trained to do the assessment. Since the networks are differentiable, we also get a gradient from the loss value (a value that tells whether the discriminator or the generator is doing a good job (closer to zero) or a bad job (far from zero)), which we can use to steer both networks in the right direction.

As a fan of computer gaming, I enjoy encountering realistic graphics, especially real-world environments, in games. Engineering or drawing such textures takes time for artists. Because GANs open up new possibilities for image synthesis, I was motivated to develop a GAN that synthesizes terrain textures that mimic real-world terrains. A texture can be represented as an image with repeated patterns. The goal of this ground terrain texture synthesis is to infer a generating process from a set of example textures, which then allows arbitrarily many new, similar samples of that texture to be generated. Success in this task is judged primarily by visual quality and similarity to the original texture as estimated by human observers, but also by other criteria such as the speed of learning [15] and the ability to generate different variations of the input texture. More applications of image rendering can be found in architecture, simulators, movies, and visual effects.

So far, terrain has been procedurally generated through a host of algorithms designed to mimic real-life terrain. In games, the terrain and the game world need to be created by the game designers and artists. This greatly limits what the player can experience, since human designers can only build static game worlds to a limited extent.
However, being able to train machine learning algorithms to learn terrain will allow the game to dynamically generate new areas. This allows for more varied gameplay experiences, as in No Man's Sky (Figure 1), where the player has no restrictions or invisible walls and can travel extensively since the game world keeps expanding. My research question is: can we specialize a Generative Adversarial Network to render terrain textures?

In the body of this thesis, Section 2 discusses related work done with GANs and other methods for texture synthesis. Section 3 gives background on GANs, specifically image classification, convolutional neural networks, and a detailed description of GANs. Section 4 describes my approach. Section 5 lists the tools I used, and Section 6 describes the methodology for my evaluation. Section 7 explains the algorithms of the GAN I implemented, data processing, and the training process of my GAN. Section 8 and Section 9 explain the generated images and the evaluation of the images. Then, Section 10 analyzes these evaluations. Lastly, Section 11 discusses the issues that came up during this research, which are left for future work.

2 Related Work

There has been ample research done with GANs that produce synthetic 2D images or 3D graphical models. Grigory Antipov et al. [1] proposed a GAN-based method for automatic face aging. In contrast to previous works that use GANs to alter facial attributes, they emphasized maintaining the original person's identity in the aged version of the person's face.

Gatys et al. [6] present a more data-driven parametric approach that allows the generation of high-quality textures from a variety of natural images. They use filter correlations (a basic operation that extracts information from images) in different layers of a convolutional network (a type of artificial neural network applied to analyze visual imagery) that is trained discriminatively on large natural image collections to classify images into classes. This results in a powerful technique that nicely captures expressive image statistics.
However, creating a single output texture requires solving an optimization problem with iterative backpropagation, which is costly in time and memory. Therefore, in my thesis, I have decided to use a Python library, TensorFlow [15], which handles the backpropagation.

Recent papers by Ulyanov [16] and Johnson [12] deal with that problem and train feed-forward convolutional networks in order to speed up the texture synthesis approach of [5]. Instead of doing the costly optimization of the output image pixels, they use powerful deep learning networks that are trained to produce images minimizing the loss value. A separate network is trained for each texture of interest and can then quickly create an image with the desired statistics in one forward pass.

In designing the generator and discriminator networks, I implemented both networks based on Ulyanov's and Gatys' algorithms, both of which use convolutional networks.

There are other types of texture synthesis as well. Efros' paper [5] presents a simple image-based method of generating novel visual appearance in which a new image is synthesized by stitching together small patches of existing images. They call this process image quilting. They first use quilting as a fast and very simple texture synthesis algorithm which produces surprisingly good results for a wide range of textures. Then they extend the algorithm to perform texture transfer, which is rendering an object with a texture taken from a different object.

3 Background

In this section, I briefly explain the concepts used in my thesis: image classification, convolutional neural networks, and a formal description of the generative adversarial network.

3.1 Image Classification

Images are 3-dimensional arrays of integers from 0 to 255, of size (Width x Height x 3). The 3 represents the three color channels: red, green, and blue (RGB). The task in image classification is to predict a single label, such as cat, dog, or ship, for a given image. Recognizing a visual concept (e.g. cat) is relatively trivial for a human, but the same might not be true for computers.
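The image representation described in this subsection can be illustrated with a minimal sketch. It assumes NumPy, which the text does not mention, and the helper name make_random_image is hypothetical. Note that NumPy conventionally orders the axes as (height, width, channels), whereas the text writes the size as (Width x Height x 3).

```python
import numpy as np


def make_random_image(width, height, seed=0):
    """Return a random RGB image as a (height, width, 3) array of uint8.

    Hypothetical helper for illustration only; a real texture would be
    loaded from a file instead of being drawn at random.
    """
    rng = np.random.default_rng(seed)
    # Integers from 0 to 255 inclusive (the upper bound 256 is exclusive).
    return rng.integers(0, 256, size=(height, width, 3), dtype=np.uint8)


image = make_random_image(width=64, height=48)

# The last axis holds the three color channels: red, green, blue.
red = image[..., 0]
green = image[..., 1]
blue = image[..., 2]

print(image.shape)  # (48, 64, 3)
print(image.dtype)  # uint8
```

Each channel slice is a 2-dimensional array of the same height and width as the image, so a single pixel is described by three integers, one per channel.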
It is worth considering the challenges involved from the perspective of a computer vision algorithm [4]. In my thesis, since I will be using terrain textures as training datasets, I will have to make sure that viewpoint variation, in which a single instance of an object can be oriented in many ways with respect to the camera, is addressed.

Figure 1: No Man's Sky [3]. The game uses procedural generation of terrains and textures of non-player characters (NPCs) and allows the player to explore different planets, solar systems, and star systems.

Figure 2: In the image above, an image classification model takes a single image and assigns probabilities to 4 labels: cat, dog, hat, mug. [4]

3.2 Convolutional Neural Network (CNN)

Before we talk about convolution, we start with the definition of a neural network. A neural network works like the network of neuron cells in the human nervous system, which are all connected by synapses. Figures 3 and 4 show a biological neuron and a mathematical form of a neuron. In Figure 3, each neuron receives input signals from its dendrites and produces output signals along its axon. The axon eventually branches out and connects through synapses to the dendrites of other neurons. In the mathematical model of a neuron in Figure 4, the signals that travel along the axons, such as $x_1$, interact multiplicatively with the dendrites of other neurons, with synaptic strength $w_1$, to produce a signal $w_1 x_1$. The idea is that the synaptic strengths $w_1$ are learned and control the strength of influence of one neuron on another. Moreover, the sign of $w_1$ determines whether that influence is excitatory (positive) or inhibitory (negative). In this model, the dendrites carry the signal to the cell body, where all the signals get summed. If the final sum is above a certain threshold, the neuron can fire, sending an impulse along its axon. In the mathematical model, the neuron fires its impulses through an activation function, which is discussed in detail in a later section.

Figure 3: A cartoon of a neuron [17]

Figure 4: Multi-layered convolution network. Left: a regular 3-layer neural network. Right: how a neuron looks mathematically [13].

3.3 Generative Adversarial Network

As proposed by Ian Goodfellow [8], a generative adversarial network (GAN) consists of a generator and a discriminator, where the discriminator tries to distinguish real objects from objects synthesized by the generator, and the generator attempts to confuse the discriminator. To build a GAN, we have to create two neural networks. Then we make them compete against each other, endlessly attempting to outdo one another. In the process, they both become better at what they do. A common analogy describes the discriminator in a GAN as a brand new police officer who is being trained to detect counterfeit money. Its job is to look at money and report whether it is fake or real. The generator is a counterfeiter, who will