Identifying polymer states by machine learning

Qianshi Wei,¹ Roger G. Melko,¹,² and Jeff Z. Y. Chen∗¹
¹Department of Physics and Astronomy, University of Waterloo, Waterloo N2L 3G1, Canada
²Perimeter Institute for Theoretical Physics, Waterloo, Ontario N2L 2Y5, Canada
∗[email protected]
(Dated: January 17, 2017)

The ability of a feed-forward neural network to learn and classify different states of polymer configurations is systematically explored. Performing numerical experiments, we find that a simple network model can, after adequate training, recognize multiple structures, including gas-like coil, liquid-like globular, and crystalline anti-Mackay and Mackay structures. The network can be trained to identify the transition points between various states, which compare well with those identified by independent specific-heat calculations. Our study demonstrates that neural networks provide an unconventional tool to study phase transitions in polymeric systems.

PACS numbers: 61.25.H-, 05.70.Fh, 05.10.-a, 36.40.Ei

Introduction.– Machine learning is an active research branch of computer science, which has vastly matured in recent years [1–6]. Artificial neural networks (NN) are a computational implementation of machine learning that have demonstrated surprising capability in recognizing patterns of enormous complexity, after being appropriately trained by humans or self-trained through learning mechanisms [7–13]. Originally motivated by the desire to establish an algorithmic model of the neuronal configuration of a mammalian brain, one finds that an artificial NN with a very simple underlying structure can already successfully perform many complex tasks. For example, simple NN models have shown transformative success in hand-writing and speech recognition [14–17].

FIG. 1: Sketch of a fully connected feed-forward artificial neural network. The circles represent neuron nodes connected layer by layer via various weight and activation parameters.

The application of machine learning in computer simulations of molecular systems has only recently emerged. Machine learning has been proposed as a method for obtaining approximate solutions for the equations of density functional theory [18]; for constructing force fields for molecular dynamics [19]; and for providing an inexpensive impurity solver for dynamical mean-field theory [20]. More recently, a simple supervised NN was implemented to classify phases in a two-dimensional (2D) spin model and to identify critical temperatures between various phases directly from configurations produced by Monte Carlo (MC) simulations [21]. This comes on the crest of a wave of work exploring the hybridization of MC and machine learning [22–25], and the deeper connections between techniques such as deep learning and theoretical concepts familiar from statistical mechanics, such as the renormalization group [26, 27].

Here, we demonstrate the capability of a standard NN model in recognizing various configurations produced from MC simulations of polymer models, identifying disordered, partially-ordered, and ordered states such as coil, liquid-like globular, and two low-energy crystalline structures. The physical system we discuss has a long history and hence serves as a typical example of a classical system displaying phase transitions that can be conveniently studied by a hybridization of conventional simulation and machine learning techniques.

The feed-forward neural network.– A standard feed-forward NN for supervised learning consists of three layers of neurons or "nodes": input (i), hidden (h), and output (o), containing N_i, N_h, and N_o neurons, respectively [see Fig. 1]. These nodes are connected through edges (representing model weights), forming a fully-connected graph between (but not among) the neuron layers. In a typical image recognition application, a 2D picture is discretized into N_i pixels and the pixel intensity values are then fed into the input layer for either training or testing purposes. The configurations of a 2D Ising model, where N_i spins have binary values, share some similarity to the standard image recognition problem, as each configuration is a "snapshot" of a 2D pattern of pixels. However, in a simple feed-forward NN, the details of this dimensionality and locality are lost, as pixels are simply sent into the input nodes as a one-dimensional vector [21].

A simulated, three-dimensional (3D) polymer configuration is mathematically represented by the 3N spatial coordinates of its N connected monomers. In order to proceed, one must decide how this information is fed into the NN. One option is to naively adopt the image recognition idea above, digitizing the 3D space into a grid and assigning a value of "occupied" or "unoccupied" to each grid point to describe the occupancy of monomers; the entire pattern would then be used for analysis. Alternatively, here we analyze a polymer configuration by directly feeding the 3N coordinates into N_i = 3N input nodes. Then, to implement supervised learning, the NN is trained to analyze the relationship between the monomer position vector and a configurational pattern corresponding to the "label" in the output layer. In total N_h = 100 hidden neurons are used, and the number of output nodes, N_o, is set to 2 or 3 depending on the physical problems described below. As a technical note, during the training session we adopted cross entropy as the cost function and used a 50% dropout rate in the determination of the parameters related to the hidden layer, a regularization method to avoid overfitting [28].
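For concreteness, the following is a minimal Python/NumPy sketch of such a network. The layer sizes (N_i = 3N = 306, N_h = 100, N_o = 2), the cross-entropy cost, and the 50% hidden-layer dropout are taken from the text; the tanh hidden activation, softmax output, and small-random initialization are our own assumptions, since the Letter does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 102                        # monomers per chain
N_i, N_h, N_o = 3 * N, 100, 2  # layer sizes quoted in the text

# Weights and biases; the small-random initialization is our assumption.
W1, b1 = rng.normal(0.0, 0.1, (N_i, N_h)), np.zeros(N_h)
W2, b2 = rng.normal(0.0, 0.1, (N_h, N_o)), np.zeros(N_o)

def forward(x, train=False, p_drop=0.5):
    """Forward pass for a batch x of flattened configurations, shape (batch, N_i)."""
    h = np.tanh(x @ W1 + b1)                 # hidden layer; tanh is our assumption
    if train:                                # 50% dropout on the hidden layer [28]
        mask = rng.random(h.shape) >= p_drop
        h = h * mask / (1.0 - p_drop)        # inverted dropout: test-time pass unchanged
    z = h @ W2 + b2
    z -= z.max(axis=1, keepdims=True)        # numerically stable softmax
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)  # output activations nu, one per state label

def cross_entropy(nu, labels):
    """Cross-entropy cost; `labels` are one-hot rows (nu = 1 for the correct state)."""
    return -np.mean(np.sum(labels * np.log(nu + 1e-12), axis=1))

# Example: outputs for a batch of 5 random placeholder "configurations".
nu = forward(rng.normal(size=(5, N_i)))
```

Training would proceed by minimizing the cross-entropy cost over the labeled configurations, e.g. by stochastic gradient descent on (W1, b1, W2, b2).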
Polymer states.– Two polymer models are used here, both consisting of N = 102 connected monomers interacting through a typical short-range hard-core repulsion and a slightly longer-range attraction. One of the polymer models is known to exhibit four different states, illustrated in Fig. 2, separated by phase transitions of finite-size character [29–31]. Within the first model, the bonded monomers interact with each other by a spring potential, identical to the one used in the Gaussian model (GSM) with a Kuhn length a [32]; two non-bonded monomers interact with each other by a square-well potential: below a square distance of 0.81a² the monomers repel through the excluded-volume interaction, and between 0.81a² and 2a² the monomers experience an attraction of magnitude −ǫ. Within the second model, the bonded monomers are modeled by a particular implementation of the Finitely Extensible Nonlinear Elastic (FENE) model, and the monomers interact with each other through the Lennard-Jones (LJ) potential with a potential well depth −ǫ. The selection of the FENE and LJ functions exactly matches those used in Ref. [31]. In this Letter we refer to the first model as GSM and the second as FENE for simplicity. More details are given in the supplemental material [33].
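As an illustration, here is a sketch of the GSM non-bonded pair energy just described. Treating the excluded-volume repulsion as a hard wall (infinite energy) is our reading of the text, and the bonded spring term is omitted since its parameters appear only in the supplemental material [33]; units are set by a = 1 and ǫ = 1.

```python
import numpy as np

EPS = 1.0   # well depth epsilon (energy unit)
A = 1.0     # Kuhn length a (length unit)

def squarewell_energy(r2):
    """Non-bonded pair energy at squared distance r2, per the GSM rules:
    hard-core repulsion below 0.81 a^2 (here a hard wall, our assumption),
    attraction -eps between 0.81 a^2 and 2 a^2, zero beyond."""
    if r2 < 0.81 * A**2:
        return np.inf        # excluded volume: such a move is rejected in MC
    if r2 < 2.0 * A**2:
        return -EPS
    return 0.0

def chain_energy(x):
    """Total non-bonded energy of one configuration x, shape (N, 3);
    bonded neighbors (j = i + 1) are excluded from the square-well sum."""
    N = len(x)
    E = 0.0
    for i in range(N - 2):
        for j in range(i + 2, N):
            E += squarewell_energy(np.sum((x[i] - x[j])**2))
    return E
```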
FIG. 2: Typical configurations of a polymer in (a) coil, (b) globular, (c) anti-Mackay, and (d) Mackay states, as the temperature is lowered. The N = 102 monomers are represented by blue spheres, except for those in the last two plots. In the liquid-like globular state, the monomer positions are disordered, as shown in (e). Both anti-Mackay and Mackay states are crystalline, differing from each other by their monomer stacking symmetries, demonstrated by the red and yellow spheres in (f) and (g).

Coil-to-globule transition.– MC simulations incorporating the Boltzmann weight were used for the GSM to produce 5×10³ independent configurations at every specified temperature k_BT/ǫ, where k_B is the Boltzmann constant. Assessing the data shown in Fig. 3(a) for both the reduced mean-square radius of gyration, S² ≡ ⟨R_g²⟩/a², and the mean-square deviation of the total energy from its average (which is proportional to the specific heat), C̃ ≡ [⟨E_int²⟩ − ⟨E_int⟩²]/N²ǫ², we observe that a coil (C)-to-globule (G) phase transition takes place at k_BT/ǫ ≈ 2.0, corresponding to the location of the peak in C̃.

To demonstrate the capability of the NN in recognizing C and G, we performed three numerical experiments. First, 3×10³ polymer configurations (specified by the coordinates of the monomers after setting a = 1) at every specified temperature within the ranges [0.5,1.5] and [2.5,3.5] were selected as training sets for the G and C states, respectively. The two normalized output neurons were designated as the G and C labels, which during training were assigned the values ν = 1 for the corresponding state and ν = 0 otherwise. No other information or estimators typical of MC simulations, such as S² or C̃, were used in the training. Once the NN was adequately trained and all NN parameters were fixed, we input 500 new configurations at every temperature in the range [0.5,5.5], which were not used in the training session, as the testing set. The averaged test values of the two output neurons, ν, are plotted in Fig. 3(b), forming the two curves behind the filled and open squares. These curves cross each other, identifying a C-to-G transition at k_BT/ǫ = 2.03, in agreement with the location of the C̃-peak, despite the fact that the NN model was trained in temperature ranges farther away from the transition point.

The C and G states have distinguishably different dimensions, represented by S². To show that the NN is not simply recognizing C and G from their different overall sizes, in our second experiment we normalized all coordinates, of both the training and testing sets, by a factor 1/S. On average, the polymer configurations then have the same normalized radius of gyration (= 1) across the entire studied temperature range. The network was then trained in the manner described above, with the normalized coordinates. The quality of the output neurons in indicating the C and G states for the testing data is equally good as in the previous case, as shown in Fig. 3(c).
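The two MC estimators above, and the 1/S rescaling used in the second experiment, are straightforward to compute from sampled configurations. A minimal sketch follows the definitions in the text; centering each configuration on its center of mass before rescaling is our assumption.

```python
import numpy as np

def rg2(x):
    """Mean-square radius of gyration R_g^2 of one configuration x, shape (N, 3)."""
    c = x.mean(axis=0)                       # center of mass (equal monomer masses)
    return np.mean(np.sum((x - c)**2, axis=1))

def estimators(configs, energies, N=102, eps=1.0, a=1.0):
    """S^2 = <R_g^2>/a^2 and C~ = (<E^2> - <E>^2)/(N eps)^2 from MC samples."""
    S2 = np.mean([rg2(x) for x in configs]) / a**2
    E = np.asarray(energies, dtype=float)
    C = (np.mean(E**2) - np.mean(E)**2) / (N * eps)**2
    return S2, C

def normalize(x):
    """Rescale one configuration by 1/S so its radius of gyration becomes 1
    (second experiment); prior centering on the center of mass is our assumption."""
    c = x.mean(axis=0)
    return (x - c) / np.sqrt(rg2(x))
```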
FIG. 3: (a) Reduced mean-square radius of gyration (plus symbols, left scale) and specific heat (circles, right scale) as functions of the reduced temperature k_BT/ǫ, determined from the MC simulations of the GSM for N = 102; (b) the NN outputs for test-recognizing independent GSM configurations; (c) the NN outputs for test-recognizing independent, normalized GSM configurations; and (d) the NN outputs for test-recognizing normalized FENE configurations. The filled and open squares represent the mean ν-values output from the globule and coil neurons, respectively. The circles in plot (d) represent the specific-heat-like γ (right scale) determined for the FENE model from a Wang-Landau MC simulation. Error bars associated with all squares are smaller than the symbol size.

While these numerical experiments were performed using the configurational data generated from the GSM, we put the network to the ultimate test in the third numerical experiment. This time, the NN parameters determined in the second experiment were retained, and we asked the network to recognize configurations generated from the FENE model, in which completely different potential functions are used than in the square-well GSM. Furthermore, instead of producing the configurations from the canonical ensemble, where k_BT/ǫ serves as the system parameter, we generated configurations from the Wang-Landau (WL) algorithm for microcanonical ensembles, in which the total (reduced) energy per particle e = E/Nǫ is directly used as the system parameter. At each e-value illustrated in Fig. 3(d), 1×10³ independent FENE configurations, normalized by their corresponding S, were sent to the GSM-trained network for testing. The two recognition curves produced from the output neurons, shown in the figure as the underlying curves behind the filled and open squares, predict a C-to-G transition at e = −1.8. This prediction can be independently verified by examining the γ(e) curve, a specific-heat-type measurement in reduced units defined in the microcanonical ensemble [31]. The data represented by circles in Fig. 3(d) were calculated from the density of states determined by the WL algorithm; they show a peak at the same location as the one successfully predicted by the GSM-trained NN.
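Reading a transition point off the two recognition curves amounts to locating where the mean globule and coil outputs cross. A minimal sketch, assuming linear interpolation between the sampled temperatures (or energies); the Letter does not state how the crossing was extracted:

```python
import numpy as np

def crossing_point(t, nu_g, nu_c):
    """Locate the control-parameter value where the mean globule and coil
    outputs cross, nu_g = nu_c, by linear interpolation between samples."""
    t = np.asarray(t, dtype=float)
    d = np.asarray(nu_g) - np.asarray(nu_c)
    k = np.nonzero(np.sign(d[:-1]) != np.sign(d[1:]))[0][0]  # first sign change
    return t[k] - d[k] * (t[k + 1] - t[k]) / (d[k + 1] - d[k])

# Example with synthetic sigmoid curves crossing at t = 2.0:
t = np.linspace(0.5, 5.5, 11)
print(crossing_point(t, 1 / (1 + np.exp(t - 2.0)), 1 / (1 + np.exp(2.0 - t))))
```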
FIG. 4: Mean NN outputs ν (squares for globule, open diamonds for anti-Mackay, and filled diamonds for Mackay) from the test samples, after the network is trained to recognize these states in regimes where they are stable. In the background, the reduced specific-heat-like γ (circles, right scale) was independently produced from the MC simulations. Error bars are smaller than the symbol size.

Low-energy polymer states.– It is well known that the FENE model exhibits three different states in the low-energy region, globule, anti-Mackay (aM), and Mackay (M), reported by careful MC studies utilizing the WL algorithm, which uses e as the system parameter [30, 31]. One could convert everything to a low-temperature description, but we stay here with the low-energy description. These three structures have subtle structural differences which cannot be distinguished by direct visualization in Figs. 2(b), (c), and (d). In particular, the crystalline aM and M states differ delicately from each other in the way the monomers are stacked [Figs. 2(f) and (g)].

As a final numerical experiment, we challenged the NN to recognize these three states, using three neuron nodes in the output layer, each assigned to recognize G, aM, and M separately. The network was trained with FENE configurations in the energy ranges e = [−4.3,−4.16], [−4.7,−4.5], and [−4.9,−4.8], where the G, aM, and M structures, respectively, can be clearly defined. The training data contained 3×10³ configurations at every energy bin. After the network was adequately trained, it was tasked with recognizing an independent set of test data covering the entire energy range in Fig. 4. Over 10³ configuration samples were used at every energy bin for this purpose. The mean ν-values of the test output, from the G, aM, and M nodes, are represented in Fig. 4 by filled squares, open diamonds, and filled diamonds. The intersections of the interpolated curves predict that the G-to-aM and aM-to-M transitions take place at e = −4.40 and e = −4.74, respectively. These NN-predicted transition points can be confirmed by an examination of γ, independently calculated from the WL MC simulations. From the γ-peaks in the plot, we determine that the G-to-aM and aM-to-M transition points are at e = −4.40±0.03 and e = −4.74±0.03, respectively, in agreement with the NN predictions.

In search of a phase transition.– The above numerical experiments demonstrated the NN's versatility in recognizing polymer configurations and its usefulness in determining transition points. There are two essential questions we have not yet adequately addressed. How does the NN-predicted location of the transition depend on the range of the training data used? And would the network mistakenly identify a phase transition for a system in which a certain physical property smoothly crosses over from relatively large to small values without going through a real phase transition?

To answer the first question, we return to the GSM MC data over a wide range of the reduced temperature, [0.5,3.5]. We conducted a series of 18 independent training sessions, each assigned a different training temperature range away from the C-to-G transition point. The first session treated MC configurations produced from the system at two temperature values, one at the extreme left and the other at the extreme right, with the C-to-G transition sitting in the middle. In the subsequent sessions the training temperature ranges expanded incrementally from the extreme left toward the center and from the extreme right toward the center. These trained networks were then put to the test by inputting independent configurations over the extended range [0.5,5.5]. The mean output ν-values from the C and G nodes are illustrated in Fig. 5(a) for a few selected sessions. As the training range expands, the NN-predicted transition temperature converges to a fixed point; the final converged temperature agrees with the one determined by the MC C̃-peak, 2.03. This study suggests that the approximate location of the transition temperature can already be estimated by using early training ranges far away from the transition point, and that a more precise determination can be achieved by progressively adding configurations closer to the transition point. The procedure in fact reveals a mechanism for finding a phase transition point without a priori knowledge of its existence: take two small training ranges as the starting point and propose a phase transition point somewhere in between.
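This range-expansion protocol is easy to script around any trainer. The sketch below only generates the per-session training windows; the linear schedule reaching the half-width 1.20 quoted in the Fig. 5(a) inset is our guess, and the training and testing calls themselves are omitted.

```python
T_LO, T_HI = 0.5, 3.5   # extremes of the GSM training data
SESSIONS = 18           # number of independent training sessions in the text

for k in range(SESSIONS):
    # Window half-width grows toward the transition; the exact schedule and
    # its endpoint (Range = 0.00 to 1.20, per the Fig. 5(a) inset) are our guess.
    half = 1.2 * k / (SESSIONS - 1)
    low_window = (T_LO, T_LO + half)    # globule-side training range
    high_window = (T_HI - half, T_HI)   # coil-side training range
    print(f"session {k:2d}: train on {low_window} and {high_window}, test on [0.5, 5.5]")
```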
To answer the second question, we conducted a numerical experiment on a NN with GSM MC data in the reduced temperature range [2.5,5.5]. Within this coil region, both S² and the average energy (not shown) have significant variations. We deliberately trained the NN so that the two output neuron nodes mistakenly regard configurations in the range [2.5,3.5] as being in phase-1 and those in [4.5,5.5] as being in phase-2. We then tested the network with independent configurations over the entire [2.5,5.5] range. The results from the output nodes are plotted in Fig. 5(b). Each node only vaguely recognizes the configurations as "phase-1" or "phase-2", with mean ν-values hovering around 0.5 with high uncertainties. No clear signals, such as those found above for true phase transitions, exist.

FIG. 5: (a) Average NN output ν from the globule and coil neurons on the testing configurations for NN models trained in various temperature ranges, and (b) average NN output ν from the "phase-1" (squares) and "phase-2" (circles) nodes on the test configurations in the temperature range [2.5,5.5] for NN models trained using low- and high-temperature samples. The inset in plot (a) is the NN-predicted transition temperature as a function of the training temperature range used. Error bars in (a) are smaller than the symbol size. The plus, cross, and square symbols represent the mean ν from the two output neurons on test samples of NN models initially trained in the temperature ranges Range = 0.00, 0.53, and 1.20, respectively. The circles in (a) represent the same data as in Fig. 3(a). The temperature range in (b) contains no actual phase-transition point.

Summary.– We have described the training of neural networks to recognize diversely and subtly different polymer states produced from Monte Carlo simulations. One advantage of this approach is that the configuration data, represented by molecular coordinates, are sent directly to the NN, without defining order parameters or calculating the heat capacity; these are conventionally used in computer simulations to rigorously determine a transition point. In particular, in the low-energy regime, a simulated system often encounters potential-energy traps which need to be treated using a non-Boltzmann weight. Taking the Wang-Landau algorithm as an example, the calculation of the heat capacity in the extremely low-energy regime requires high numerical precision in the computed density of states, which is achievable but requires extensive computation. The NN process described here, on the other hand, does not require such precision, as long as the independent configurations used for supervised training are produced by a numerical simulation.

The example used here is a classical molecular system displaying gas-, liquid-, and crystal-like structures at various energies. We demonstrated that a NN can classify both first-order (globule to anti-Mackay) and second-order (coil-to-globule and anti-Mackay to Mackay) transitions. The direct use of molecular coordinates as input into the NN underlies the robustness and simplicity of our approach, and suggests that other simulation tools, such as molecular dynamics, could be used as well. The outcome of this work provides a compelling reason to incorporate machine learning techniques into molecular simulations more generally, as a powerful hybridized computational tool for the future study of polymeric systems.

Acknowledgement.– Financial support from the Natural Sciences and Engineering Research Council of Canada is gratefully acknowledged. This work was made possible by the facilities of the Shared Hierarchical Academic Research Computing Network and Compute Canada.
[1] W. J. Frawley, G. Piatetsky-Shapiro, and C. J. Matheus, AI Magazine 13, 57 (1992).
[2] T. L. H. Watkin, A. Rau, and M. Biehl, Rev. Mod. Phys. 65, 499 (1993).
[3] I. H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques (Morgan Kaufmann, 2005).
[4] R. S. Michalski, J. G. Carbonell, and T. M. Mitchell, Machine Learning: An Artificial Intelligence Approach (Springer Science & Business Media, 2013).
[5] K. P. Murphy, Machine Learning: A Probabilistic Perspective (MIT Press, 2012).
[6] M. Jordan and T. Mitchell, Science 349, 255 (2015).
[7] K. Fukushima, Neural Netw. 1, 119 (1988).
[8] H. S. Seung, H. Sompolinsky, and N. Tishby, Phys. Rev. A 45, 6056 (1992).
[9] R. Parekh, J. Yang, and V. Honavar, IEEE Trans. Neural Netw. 11, 436 (2000).
[10] A. K. Jain, R. P. W. Duin, and J. Mao, IEEE Trans. Pattern Anal. Mach. Intell. 22, 4 (2000).
[11] C. Davatzikos, K. Ruparel, Y. Fan, D. Shen, M. Acharyya, J. Loughead, R. Gur, and D. D. Langleben, Neuroimage 28, 663 (2005).
[12] C. Bishop, Pattern Recognition and Machine Learning (Springer, New York, 2007).
[13] J. Schmidhuber, Neural Netw. 61, 85 (2015).
[14] Y. LeCun, L. Jackel, L. Bottou, C. Cortes, J. S. Denker, H. Drucker, I. Guyon, U. Muller, E. Sackinger, P. Simard, et al., Neural Networks: The Statistical Mechanics Perspective 261, 276 (1995).
[15] M. A. Nielsen, Neural Networks and Deep Learning, http://neuralnetworksanddeeplearning.com/ (2015).
[16] R. P. Lippmann, Neural Comput. 1, 1 (1989).
[17] G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, et al., IEEE Signal Process. Mag. 29, 82 (2012).
[18] J. C. Snyder, M. Rupp, K. Hansen, K.-R. Müller, and K. Burke, Phys. Rev. Lett. 108, 253002 (2012).
[19] J. Behler and M. Parrinello, Phys. Rev. Lett. 98, 146401 (2007).
[20] L.-F. Arsenault, A. Lopez-Bezanilla, O. A. von Lilienfeld, and A. J. Millis, Phys. Rev. B 90, 155136 (2014).
[21] J. Carrasquilla and R. G. Melko, arXiv:1605.01735 (2016).
[22] G. Torlai and R. G. Melko, Phys. Rev. B 94, 165134 (2016).
[23] L. Huang and L. Wang, arXiv:1610.02746 (2016).
[24] J. Liu, Y. Qi, Z. Y. Meng, and L. Fu, arXiv:1610.03137 (2016).
[25] N. Portman and I. Tamblyn, arXiv:1611.05891 (2016).
[26] P. Mehta and D. J. Schwab, arXiv:1410.3831 (2014).
[27] H. W. Lin and M. Tegmark, arXiv:1608.08225 (2016).
[28] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, J. Mach. Learn. Res. 15, 1929 (2014).
[29] S. Schnabel, T. Vogel, M. Bachmann, and W. Janke, Chem. Phys. Lett. 476, 201 (2009).
[30] D. Seaton, T. Wüst, and D. Landau, Phys. Rev. E 81, 011802 (2010).
[31] S. Schnabel, D. T. Seaton, D. P. Landau, and M. Bachmann, Phys. Rev. E 84, 011127 (2011).
[32] M. Doi and S. F. Edwards, The Theory of Polymer Dynamics (Oxford Univ. Press, New York, 1986).
[33] See Supplemental Material at [URL will be inserted by publisher] for the technical details (2017).
