Ann. N.Y. Acad. Sci. ISSN 0077-8923
ANNALS OF THE NEW YORK ACADEMY OF SCIENCES
Issue: The Year in Evolutionary Biology

The use of information theory in evolutionary biology

Christoph Adami 1,2,3

1 Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, Michigan. 2 Department of Physics and Astronomy, Michigan State University, East Lansing, Michigan. 3 BEACON Center for the Study of Evolution in Action, Michigan State University, East Lansing, Michigan

Address for correspondence: C. Adami, Department of Microbiology and Molecular Genetics, 2209 Biomedical and Physical Sciences Building, Michigan State University, East Lansing, [email protected]

doi: 10.1111/j.1749-6632.2011.06422.x

Information is a key concept in evolutionary biology. Information stored in a biological organism's genome is used to generate the organism and to maintain and control it. Information is also that which evolves. When a population adapts to a local environment, information about this environment is fixed in a representative genome. However, when an environment changes, information can be lost. At the same time, information is processed by animal brains to survive in complex environments, and the capacity for information processing also evolves. Here, I review applications of information theory to the evolution of proteins and to the evolution of information processing in simulated agents that adapt to perform a complex task.

Keywords: information theory; evolution; protein evolution; animat evolution

Introduction

Evolutionary biology has traditionally been a science that used observation and the analysis of specimens to draw inferences about common descent, adaptation, variation, and selection.1,2 In contrast to this discipline that requires fieldwork and meticulous attention to detail stands the mathematical theory of population genetics,3,4 which developed in parallel but somewhat removed from evolutionary biology, as it could treat exactly only very abstract cases. The mathematical theory cast Darwin's insight about inheritance, variation, and selection into formulae that could predict particular aspects of the evolutionary process, such as the probability that an allele that conferred a particular advantage would go to fixation, how long this process would take, and how the process would be modified by different forms of inheritance. Missing from these two disciplines, however, was a framework that would allow us to understand the broad macro-evolutionary arcs that we can see everywhere in the biosphere and in the fossil record—the lines of descent that connect simple to complex forms of life. Granted, the existence of these unbroken lines—and the fact that they are the result of the evolutionary mechanisms at work—is not in doubt. Yet, mathematical population genetics cannot quantify them because the theory only deals with existing variation. At the same time, the uniqueness of any particular line of descent appears to preclude a generative principle, or a framework that would allow us to understand the generation of these lines from a perspective once removed from the microscopic mechanisms that shape genes one mutation at a time. In the last 24 years or so, the situation has changed dramatically because of the advent of long-term evolution experiments with replicate lines of bacteria adapting for over 50,000 generations,5,6 and in silico evolution experiments covering millions of generations.7,8 Both experimental approaches, in their own way, have provided us with key insights into the evolution of complexity on macroscopic time scales.6,8-14

But there is a common concept that unifies the digital and the biochemical approach: information. That information is the essence of "that which evolves" has been implicit in many writings (although the word "information" does not appear in Darwin's On the Origin of Species). Indeed, shortly after the genesis of the theory of information at the hands of a Bell Laboratories engineer,15 this theory was thought to ultimately explain everything from the higher functions of living organisms down to metabolism, growth, and differentiation.16 However, this optimism soon gave way to a miasma of confounding mathematical and philosophical arguments that dampened enthusiasm for the concept of information in biology for decades. To some extent, evolutionary biology was not yet ready for a quantitative treatment of "that which evolves": the year of publication of "Information in Biology"16 coincided with the discovery of the structure of DNA, and the wealth of sequence data that catapulted evolutionary biology into the computer age was still half a century away.
Colloquially, information is often described as something that aids in decision making. Interestingly, this is very close to the mathematical meaning of "information," which is concerned with quantifying the ability to make predictions about uncertain systems. Life—among many other aspects—has the peculiar property of displaying behavior or characters that are appropriate, given the environment. We recognize this of course as the consequence of adaptation, but the outcome is that the adapted organism's decisions are "in tune" with its environment—the organism has information about its environment. One of the insights that has emerged from the theory of computation is that information must be physical—information cannot exist without a physical substrate that encodes it.17 In computers, information is encoded in zeros and ones, which themselves are represented by different voltages on semiconductors. The information we retain in our brains also has a physical substrate, even though its physiological basis depends on the type of memory and is far from certain. Context-appropriate decisions require information, however it is stored. For cells, we now know that this information is stored in a cell's inherited genetic material, and is precisely the kind that Shannon described in his 1948 articles. If inherited genetic material represents information, then how did the information-carrying molecules acquire it? Is the amount of information stored in genes increasing throughout evolution, and if so, why? How much information does an organism store? How much in a single gene? If we can replace a discussion of the evolution of complexity along the various lines of descent with a discussion of the evolution of information, perhaps then we can find those general principles that have eluded us so far.

In this review, I focus on two uses of information theory in evolutionary biology: First, the quantification of the information content of genes and proteins and how this information may have evolved along the branches of the tree of life. Second, the evolution of information-processing structures (such as brains) that control animals, and how the functional complexity of these brains (and how they evolve) could be quantified using information theory. The latter approach reinforces a concept that has appeared in neuroscience repeatedly: the value of information for an adapted organism is fitness,18 and the complexity of an organism's brain must be reflected in how it manages to process, integrate, and make use of information for its own advantage.19

Entropy and information in molecular sequences

To define entropy and information, we first must define the concept of a random variable. In probability theory, a random variable X is a mathematical object that can take on a finite number of different states x_1, ..., x_N with specified probabilities p_1, ..., p_N. We should keep in mind that a mathematical random variable is a description—sometimes accurate, sometimes not—of a physical object. For example, the random variable that we would use to describe a fair coin has two states: x_1 = heads and x_2 = tails, with probabilities p_1 = p_2 = 0.5. Of course, an actual coin is a far more complex device—it may deviate from being true, it may land on an edge once in a while, and its faces can make different angles with true North. Yet, when coins are used for demonstrations in probability theory or statistics, they are most succinctly described with two states and two equal probabilities. Nucleic acids can be described probabilistically in a similar manner. We can define a nucleic acid random variable X as having four states x_1 = A, x_2 = C, x_3 = G, and x_4 = T, which it can take on with probabilities p_1, ..., p_4, while being perfectly aware that the nucleic acid molecules themselves are far more complex, and deserve a richer description than the four-state abstraction. But given the role that these molecules play as information carriers of the genetic material, this abstraction will serve us very well going forward.
Entropy

Using the concept of a random variable X, we can define its entropy (sometimes called uncertainty) as20,21

H(X) = -\sum_{i=1}^{N} p_i \log p_i.   (1)

Here, the logarithm is taken to an arbitrary base that will normalize (and give units to) the entropy. If we choose the dual logarithm, the units are "bits," whereas if we choose base e, the units are "nats." Here, I will often choose the size of the alphabet as the base of the logarithm, and call the unit the "mer."22 So, if we describe nucleic acid sequences (alphabet size 4), a single nucleotide can have up to 1 "mer" of entropy, whereas if we describe proteins (logarithms taken to the base 20), a single residue can have up to 1 mer of entropy. Naturally, a 5-mer has up to 5 mers of entropy, and so on.

A true coin, we can immediately convince ourselves, has an entropy of 1 bit. A single random nucleotide, by the same reasoning, has an entropy of 1 mer (or 2 bits) because

H(X) = -\sum_{i=1}^{4} \tfrac{1}{4} \log_4 \tfrac{1}{4} = 1.   (2)

What is the entropy of a nonrandom nucleotide? To determine this, we have to find the probabilities p_i with which that nucleotide is found at a particular position within a gene. Say we are interested in nucleotide 28 (counting from 5' to 3') of the 76 base pair tRNA gene of the bacterium Escherichia coli. What is its entropy? To determine this, we need to obtain an estimate of the probability that any of the four nucleotides are found at that particular position. This kind of information can be gained from sequence repositories. For example, the database tRNAdb23 contains sequences for more than 12,000 tRNA genes. For the E. coli tRNA gene, among 33 verified sequences (for different anticodons), we find 5 that show an "A" at the 28th position, 17 have a "C," 5 have a "G," and 6 have a "T." We can use these numbers to estimate the substitution probabilities at this position as

p_{28}(A) = 5/33, \quad p_{28}(C) = 17/33, \quad p_{28}(G) = 5/33, \quad p_{28}(T) = 6/33,   (3)

which, even though the statistics are not good, allow us to infer that "C" is preferred at that position. The entropy of position variable X_28 can now be estimated as

H(X_{28}) = -2 \times \tfrac{5}{33}\log_2\tfrac{5}{33} - \tfrac{17}{33}\log_2\tfrac{17}{33} - \tfrac{6}{33}\log_2\tfrac{6}{33} \approx 1.765 \text{ bits},   (4)

or less than the maximal 2 bits we would expect if all nucleotides appeared with equal probability. Such an uneven distribution of states immediately suggests a "betting" strategy that would allow us to make predictions with accuracy better than chance about the state of position variable X_28: if we bet that we would see a "C" there, then we would be right over half the time on average, as opposed to a quarter of the time for a variable that is evenly distributed across the four states. In other words, information is stored in this variable.
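As an illustration, the estimate in Eq. (4) can be reproduced directly from the counts behind Eq. (3). The short Python sketch below is only illustrative; the function and variable names are mine, but the counts are those quoted in the text.

```python
import math

def entropy(probs, base=2.0):
    """Shannon entropy of a discrete distribution, Eq. (1)."""
    return -sum(p * math.log(p, base) for p in probs if p > 0.0)

# Counts of A, C, G, T observed at position 28 of the E. coli tRNA
# alignment quoted in Eq. (3): 5, 17, 5, and 6 out of 33 sequences.
counts_28 = {"A": 5, "C": 17, "G": 5, "T": 6}
total = sum(counts_28.values())
p_28 = [n / total for n in counts_28.values()]

print(entropy(p_28, base=2))   # ~1.765 bits, cf. Eq. (4)
print(entropy(p_28, base=4))   # the same entropy expressed in "mers"
```

Changing the logarithm base from 2 to 4 converts the same quantity from bits to mers, as described above.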
Information

To learn how to quantify the amount of information stored, let us go through the same exercise for a different position (say, position 41; the precise numbering of nucleotide positions differs between databases) of that molecule, to find approximately

p_{41}(A) = 0.24, \quad p_{41}(C) = 0.46, \quad p_{41}(G) = 0.21, \quad p_{41}(T) = 0.09,   (5)

so that H(X_41) ≈ 1.795 bits. To determine how likely it is to find any particular nucleotide at position 41 given position 28 is a "C," for example, we have to collect conditional probabilities. They are easily obtained if we know the joint probability to observe any of the 16 combinations AA, ..., TT at the two positions. The conditional probability to observe state j at position 41 given state i at position 28 is

p_{j|i} = \frac{p_{ij}}{p_i},   (6)

where p_{ij} is the joint probability to observe state i at position 28 and at the same time state j at position 41. The notation "j|i" is read as "j given i." Collecting these probabilities from the sequence data gives the probability matrix that relates the random variable X_28 to the variable X_41:

p(X_{41}|X_{28}) =
\begin{pmatrix}
p(A|A) & p(A|C) & p(A|G) & p(A|T) \\
p(C|A) & p(C|C) & p(C|G) & p(C|T) \\
p(G|A) & p(G|C) & p(G|G) & p(G|T) \\
p(T|A) & p(T|C) & p(T|G) & p(T|T)
\end{pmatrix}
=
\begin{pmatrix}
0.2 & 0.235 & 0 & 0.5 \\
0 & 0.706 & 0.2 & 0.333 \\
0.8 & 0 & 0.4 & 0.167 \\
0 & 0.059 & 0.4 & 0
\end{pmatrix}.   (7)

We can glean important information from these probabilities. It is clear, for example, that positions 28 and 41 are not independent from each other. If nucleotide 28 is an "A," then position 41 can only be an "A" or a "G," but mostly (4/5 times) you expect a "G." But consider the dependence between nucleotides 42 and 28

p(X_{42}|X_{28}) =
\begin{pmatrix}
0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0
\end{pmatrix}.   (8)

This dependence is striking—if you know position 28, you can predict (based on the sequence data given) position 42 with certainty. The reason for this perfect correlation lies in the functional interaction between the sites: 28 and 42 are paired in a stem of the tRNA molecule in a Watson–Crick pair—to enable the pairing, a "G" must be associated with a "C," and a "T" (encoding a U) must be associated with an "A." It does not matter which is at any position as long as the paired nucleotide is complementary. And it is also clear that these associations are maintained by the selective pressures of Darwinian evolution—a substitution that breaks the pattern leads to a molecule that does not fold into the correct shape to efficiently translate messenger RNA into proteins. As a consequence, the organism bearing such a mutation will be eliminated from the gene pool. This simple example shows clearly the relationship between information theory and evolutionary biology: fitness is reflected in information, and when selective pressures maximize fitness, information must be maximized concurrently.
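Matrices such as those in Eqs. (7) and (8) come from nothing more than counting pairs of nucleotides in the aligned sequences and normalizing them according to Eq. (6). A minimal Python sketch of that bookkeeping follows; the handful of (position 28, position 41) pairs listed here is hypothetical and stands in for the 33 tRNA sequences actually used in the text.

```python
from collections import Counter

def conditional_matrix(pairs, alphabet="ACGT"):
    """Estimate p(x41 | x28) from observed (x28, x41) pairs, Eq. (6):
    joint counts divided by the counts of the conditioning position."""
    joint = Counter(pairs)                   # counts n_ij
    marginal = Counter(i for i, _ in pairs)  # counts n_i at position 28
    return {i: {j: joint[(i, j)] / marginal[i] for j in alphabet}
            for i in alphabet if marginal[i] > 0}

# Hypothetical aligned (position 28, position 41) pairs, for illustration only.
pairs = [("C", "C"), ("C", "C"), ("C", "A"), ("A", "G"), ("A", "G"),
         ("A", "A"), ("G", "G"), ("G", "C"), ("T", "A"), ("T", "C")]
print(conditional_matrix(pairs))
```

Each inner dictionary corresponds to one column of Eq. (7), that is, the distribution of position 41 conditioned on one particular state of position 28.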
We can now proceed and calculate the information content. Each column in Eq. (7) represents a conditional probability to find a particular nucleotide at position 41, given a particular value is found at position 28. We can use these values to calculate the conditional entropy to find a particular nucleotide, given that position 28 is "A," for example, as

H(X_{41}|X_{28} = A) = -0.2\log_2 0.2 - 0.8\log_2 0.8 \approx 0.72 \text{ bits}.   (9)

This allows us to calculate the amount of information that is revealed (about X_41) by knowing the state of X_28. If we do not know the state of X_28, our uncertainty about X_41 is 1.795 bits, as calculated earlier. But revealing that X_28 actually is an "A" has reduced our uncertainty to 0.72 bits, as we saw in Eq. (9). The information we obtained is then just the difference

I(X_{41} : X_{28} = A) = H(X_{41}) - H(X_{41}|X_{28} = A) \approx 1.075 \text{ bits},   (10)

that is, just over 1 bit. The notation in Eq. (10), indicating information between two variables by a colon (sometimes a semicolon), is conventional. We can also calculate the average amount of information about X_41 that is gained by revealing the state of X_28 as

I(X_{41} : X_{28}) = H(X_{41}) - H(X_{41}|X_{28}) \approx 0.64 \text{ bits}.   (11)

Here, H(X_41|X_28) is the average conditional entropy of X_41 given X_28, obtained by averaging the four conditional entropies (for the four possible states of X_28) using the probabilities with which X_28 occurs in any of its four states, given by Eq. (3). If we apply the same calculation to the pair of positions X_42 and X_28, we should find that knowing X_28 reduces our uncertainty about X_42 to zero—indeed, X_28 carries perfect information about X_42. The covariance between residues in an RNA secondary structure captured by the mutual entropy can be used to predict secondary structure from sequence alignments alone.24
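These quantities follow mechanically from the distributions in Eqs. (3) and (7). The sketch below, again only an illustration, starts from the published (rounded) probabilities, derives the marginal of position 41 from them, and recovers Eqs. (9)-(11) up to that rounding.

```python
import math

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

bases = "ACGT"

# Marginal distribution of position 28, Eq. (3).
p28 = {"A": 5/33, "C": 17/33, "G": 5/33, "T": 6/33}

# Conditional probabilities p(x41 | x28): the columns of Eq. (7), as published.
p41_given_28 = {
    "A": {"A": 0.2,   "C": 0.0,   "G": 0.8,   "T": 0.0},
    "C": {"A": 0.235, "C": 0.706, "G": 0.0,   "T": 0.059},
    "G": {"A": 0.0,   "C": 0.2,   "G": 0.4,   "T": 0.4},
    "T": {"A": 0.5,   "C": 0.333, "G": 0.167, "T": 0.0},
}

# Marginal of position 41 implied by Eqs. (3) and (7).
p41 = {b: sum(p28[c] * p41_given_28[c][b] for c in bases) for b in bases}
H41 = entropy(p41.values())                       # ~1.8 bits, cf. Eq. (5)

# Average conditional entropy H(X41 | X28) and mutual information, Eq. (11).
H41_given_28 = sum(p28[c] * entropy(p41_given_28[c].values()) for c in bases)
print(H41 - H41_given_28)                         # ~0.64 bits, cf. Eq. (11)

# Information revealed by the single observation X28 = "A", cf. Eq. (10).
print(H41 - entropy(p41_given_28["A"].values()))  # ~1.08 bits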
Information content of proteins

We have seen that different positions within a biomolecule can carry information about other positions, but how much information do they store about the environment within which they evolve? This question can be answered using the same information-theoretic formalism introduced earlier. Information is defined as a reduction in our uncertainty (caused by our ability to make predictions with an accuracy better than chance) when armed with information. Here we will use proteins as our biomolecules, which means our random variables can take on 20 states, and our protein variable will be given by the joint variable

X = X_1 X_2 \cdots X_L,   (12)

where L is the number of residues in the protein. We now ask: "How much information about the environment (rather than about another residue) is stored in a particular residue?" To answer this, we have to first calculate the uncertainty about any particular residue in the absence of information about the environment. Clearly, it is the environment within which a protein finds itself that constrains the particular amino acids that a position variable can take on. If I do not specify this environment, there is nothing that constrains any particular residue i, and as a consequence the entropy is maximal

H(X_i) = H_{max} = \log_2 20 \approx 4.32 \text{ bits}.   (13)

In any functional protein, the residue is highly constrained, however. Let us imagine that the possible states of the environment can be described by a random variable E (that takes on specific environmental states e_j with given probabilities). Then the information about environment E = e_j contained in position variable X_i of protein X is given by

I(X_i : E = e_j) = H_{max} - H(X_i|E = e_j),   (14)

in perfect analogy to Eq. (10). How do we calculate the information content of the entire protein, armed only with the information content of residues? If residues do not interact (that is, the state of a residue at one position does not reveal any information about the state of a residue at another position), then the information content of the protein would just be a sum of the information content of each residue

I(X : E = e_j) = \sum_{i=1}^{L} I(X_i : E = e_j).   (15)

This independence of positions certainly could not be assumed in RNA molecules that rely on Watson–Crick binding to establish their secondary structure. In proteins, correlations between residues are much weaker (but certainly still important, see, e.g., Refs. 25-33), and we can take Eq. (15) as a first-order approximation of the information content, while keeping in mind that residue–residue correlations encode important information about the stability of the protein and its functional affinity to other molecules. Note, however, that a population with two or more subdivisions, where each subpopulation has different amino acid frequencies, can mimic residue correlations on the level of the whole population when there are none on the level of the subpopulations.34

For most cases that we will have to deal with, a protein is only functional in a very defined cellular environment, and as a consequence the conditional entropy of a residue is fixed by the substitution probabilities that we can observe. Let us take as an example the rodent homeodomain protein,35 defined by 57 residues. The environment for this protein is of course the rodent, and we might surmise that the information content of the homeodomain protein in rodents is different from the homeodomain protein in primates, for example, simply because primates and rodents diverged about 100 million years ago,36 and have since then taken independent evolutionary paths. We can test this hypothesis by calculating the information content of rodent proteins and compare it to the primate version, using substitution probabilities inferred from sequence data that can be found, for example, in the Pfam database.37 Let us first look at the entropy per residue, along the chain length of the 57-mer. But instead of calculating the entropy in bits (by taking the base-2 logarithm), we will calculate the entropy in "mers," by taking the logarithm to base 20. This way, a single residue can have at most 1 mer of entropy, and the 57-mer has at most 57 mers of entropy. The entropic profile (entropy per site as a function of site) of the rodent homeodomain protein depicted in Figure 1 shows that the entropy varies considerably from site to site, with strongly conserved and highly variable residues.

Figure 1. Entropic profile of the 57-amino acid rodent homeodomain, obtained from 810 sequences in Pfam (accessed February 3, 2011). Error of the mean is smaller than the data points shown. Residues are numbered 2-58 as is common for this domain.35
When estimating entropies from finite ensembles (small number of sequences), care must be taken to correct for the bias that is inherent in estimating the probabilities from the frequencies. Rare residues will be assigned zero probabilities in small ensembles but not in larger ones. Because this error is not symmetric (probabilities will always be underestimated), the bias is always toward smaller entropies. Several methods can be applied to correct for this, and I have used here the second-order bias correction, described for example in Ref. 38. Summing up the entropies per site shown in Figure 1, we can get an estimate for the information content by applying Eq. (15). The maximal entropy H_max, when measured in mers, is of course 57, so the information content is just

I_{Rodentia} = 57 - \sum_{i=1}^{57} H(X_i|e_{Rodentia}),   (16)

which comes out to

I_{Rodentia} = 25.29 \pm 0.09 \text{ mers},   (17)

where the error of the mean is obtained from the theoretical estimate of the variance given the frequency estimate.38

The same analysis can be repeated for the primate homeodomain protein. In Figure 2, we can see the difference between the "entropic profile" of rodents and primates

\Delta\text{Entropy} = H(X_i|e_{Rodentia}) - H(X_i|e_{Primates}),   (18)

which shows some significant differences, in particular, it seems, at the edges between structural motifs in the protein.

Figure 2. Difference between entropic profile "ΔEntropy" of the homeobox protein of rodents and primates (the latter from 903 sequences in Pfam, accessed February 3, 2011). Error bars are the error of the mean of the difference, using the average of the number of sequences. The colored boxes indicate structural domains as determined for the fly version of this gene. ("N" refers to the protein's "N-terminus.")

When summing up the entropies to arrive at the total information content of the primate homeodomain protein we obtain

I_{Primates} = 25.43 \pm 0.08 \text{ mers},   (19)

which is identical to the information content of rodent homeodomains within statistical error. We can thus conclude that although the information is encoded somewhat differently between the rodent and the primate version of this protein, the total information content is the same.
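A per-site calculation of this kind is easy to sketch. Two caveats about the Python illustration below: the article uses the second-order bias correction of Ref. 38, whereas this sketch applies the simpler first-order (Miller-Madow) correction to convey the idea, and the toy alignment is hypothetical rather than the Pfam homeodomain data.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def site_entropy_mers(column, alphabet=AMINO_ACIDS):
    """Entropy of one alignment column in 'mers' (log base 20), with a
    first-order (Miller-Madow) small-sample correction; the article uses a
    second-order correction (Ref. 38). Characters outside the alphabet
    (gaps, unknowns) are ignored."""
    counts = Counter(c for c in column if c in alphabet)
    n = sum(counts.values())
    h = -sum((k / n) * math.log(k / n, len(alphabet)) for k in counts.values())
    # Miller-Madow: add (observed states - 1) / (2 n ln 20) to the raw estimate.
    return h + (len(counts) - 1) / (2 * n * math.log(len(alphabet)))

def information_content(alignment):
    """I = L - sum_i H(X_i | e), cf. Eqs. (15)-(16), for equal-length
    aligned sequences drawn from one taxonomic group."""
    length = len(alignment[0])
    columns = ["".join(seq[i] for seq in alignment) for i in range(length)]
    return length - sum(site_entropy_mers(col) for col in columns)

# Toy alignment (hypothetical sequences, not the Pfam homeodomain data).
toy = ["MKQLEDKV", "MKQLEDRV", "MKHLEDKV", "MKQLEDKI"]
print(information_content(toy))
```

Applied to the 810 rodent and 903 primate homeodomain sequences described above (with the published bias correction), this is the calculation behind Eqs. (17) and (19).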
Evolution of information

Although the total information content of the homeodomain protein has not changed between rodents and primates, what about longer time intervals? If we take a protein that is ubiquitous among different forms of life (i.e., its homologue is present in many different branches), has its information content changed as it is used in more and more complex forms of life? One line of argument tells us that if the function of the protein is the same throughout evolutionary history, then its information content should be the same in each variant. We saw a hint of that when comparing the information content of the homeodomain protein between rodents and primates. But we can also argue instead that because information is measured relative to the environment the protein (and thus the organism) finds itself in, then organisms that live in very different environments can potentially have different information content even if the sequences encoding the proteins are homologous. Thus, we could expect differences in protein information content in organisms that are different enough that the protein is used in different ways. But it is certainly not clear whether we should observe a trend of increasing or decreasing information along the line of descent. To get a first glimpse at what these differences could be like, I will take a look here at the evolution of information in two proteins that are important in the function of most animals—the homeodomain protein and the COX2 (cytochrome-c-oxidase subunit 2) protein.

The homeodomain (or homeobox) protein is essential in determining the pattern of development in animals—it is crucial in directing the arrangement of cells according to a particular body plan.39 In other words, the homeobox determines where the head goes and where the tail goes. Although it is often said that these proteins are specific to animals, some plants have homeodomain proteins that are homologous to those I study here.40 The COX2 protein, on the other hand, is a subunit of a large protein complex with 13 subunits.41 Whereas a nonfunctioning (or severely impaired) homeobox protein certainly leads to aborted development, an impaired COX complex has a much less drastic effect—it leads to mitochondrial myopathy due to a cytochrome oxidase deficiency,42 but is usually not fatal.43 Thus, by testing the changes within these two proteins, we are examining proteins with very different selective pressures acting on them.

To calculate the information content of each of these proteins along the evolutionary line of descent, in principle we need access to the sequences of extinct forms of life. Even though the resurrection of such extinct sequences is possible in principle44 using an approach dubbed "paleogenetics,"45,46 we can take a shortcut by grouping sequences according to the depth that they occupy within the phylogenetic tree. So when we measure the information content of the homeobox protein on the taxonomic level of the family, we include in there the sequences of homeobox proteins of chimpanzees, gorillas, and orangutans along with humans. As the chimpanzee version, for example, is essentially identical with the human version, we do not expect to see any change in information content when moving from the species level to the genus level. But we can expect that by grouping the sequences on the family level (rather than the genus or species level), we move closer toward evolutionarily more ancient proteins, in particular because this group (the great apes) is used to reconstruct the sequence of the ancestor of that group. The great apes are but one family of the order primates, which besides the great apes also contains the families of monkeys, lemurs, lorises, tarsiers, and galagos. Looking at the homeobox protein of all the primates then takes us further back in time. A simplified version of the phylogeny of animals is shown in Figure 3, which shows the hierarchical organization of the tree.

Figure 3. Simplified phylogenetic classification of animals. At the root of this tree (on the left tree) are the eukaryotes, but only the animal branch is shown here. If we follow the line of descent of humans, we move on the branch toward the vertebrates. The vertebrate clade itself is shown in the tree on the right, and the line of descent through this tree follows the branches that end in the mammals. The mammal tree, finally, is shown at the bottom, with the line ending in Homo sapiens indicated in red.
The database Pfam uses a range of different taxonomic levels (anywhere from 12 to 22, depending on the branch) defined by the NCBI Taxonomy Project,47 which we can take as a convenient proxy for taxonomic depth—ranging from the most basal taxonomic identifications (such as phylum) to the most specific ones. In Figure 4, we can see the total sequence entropy

H_k(X) = \sum_{i=1}^{57} H(X_i|e_k),   (20)

for sequences with the NCBI taxonomic level k, as a function of the level depth. Note that sequences at level k always include all the sequences at level k-1. Thus, H_1(X), which is the entropy of all homeodomain sequences at level k = 1, includes the sequences of all eukaryotes. Of course, the taxonomic level description is not a perfect proxy for time. On the vertebrate line, for example, the genus Homo occupies level k = 14, whereas the genus Mus occupies level k = 16. If we now plot H_k(X) versus k (for the major phylogenetic groups only), we see a curious splitting of the lines based only on total sequence entropy, and thus information (as information is just I = 57 - H if we measure entropy in mers). At the base of the tree, the metazoan sequences split into chordate proteins with a lower information content (higher entropy) and arthropod sequences with higher information content, possibly reflecting the different uses of the homeobox in these two groups. The chordate group itself splits into mammalian proteins and the fish homeodomain. There is even a notable split in information content into two major groups within the fishes.

Figure 4. Entropy of homeobox domain protein sequences (PF00046 in the Pfam database, accessed July 20, 2006) as a function of taxonomic depth for different major groups that have at least 200 sequences in the database, connected by phylogenetic relationships. Selected groups are annotated by name. Fifty-seven core residues were used to calculate the molecular entropy. Core residues have at least 70% sequence coverage in the database.

The same analysis applied to subunit II of the COX protein (counting only 120 residue sites that have sufficient statistics in the database) gives a very different picture. Except for an obvious split of the bacterial version of the protein and the eukaryotic one, the total entropy markedly decreases across the lines as the taxonomic depth increases. Furthermore, the arthropod COX2 is more entropic than the vertebrate one (see Fig. 5) as opposed to the ordering for the homeobox protein. This finding suggests that the evolution of the protein information content is specific to each protein, and most likely reflects the adaptive value of the protein for each family.

Figure 5. Entropy of COX subunit II (PF00116 in the Pfam database, accessed June 22, 2006) protein sequences as a function of taxonomic depth for selected different groups (at least 200 sequences per group), connected by phylogenetic relationships. One hundred twenty core residues were used to calculate the molecular entropy.

Evolution of information in robots and animats

The evolution of information within the genes of adapting organisms is but one use of information theory in evolutionary biology. Just as anticipated in the heydays of the "Cybernetics" movement,48 information theory has indeed something to say about the evolution of information processing in animal brains. The general idea behind the connection between information and function is simple: because information (about a particular system) is what allows the bearer to make predictions (about that particular system) with accuracy better than chance, information is valuable as long as prediction is valuable. In an uncertain world, making accurate predictions is tantamount to survival. In other words, we expect that information, acquired from the environment and processed, has survival value and therefore is selected for in evolution.

Predictive information

The connection between information and fitness can be made much more precise. A key relation between information and its value for agents that survive in an uncertain world as a consequence of their actions in it was provided by Ay et al.,49 who applied a measure called "predictive information" (defined earlier by Bialek et al.50 in the context of dynamical systems theory) to characterize the behavioral complexity of an autonomous robot. These authors showed that the mutual entropy between a changing world (as represented by changing states in an organism's sensors) and the actions of motors that drive the agent's behavior (thus changing the future perceived states) is equivalent to Bialek's predictive information as long as the agent's decisions are Markovian, that is, only depend on the state of the agent and the environment at the preceding time. This predictive information is defined as the shared entropy between motor variables Y_t and the sensor variables at the subsequent time point X_{t+1}

I_{pred} = I(Y_t : X_{t+1}) = H(X_{t+1}) - H(X_{t+1}|Y_t).   (21)

Here, H(X_{t+1}) is the entropy of the sensor states at time t + 1, defined as

H(X_{t+1}) = -\sum_{x_{t+1}} p(x_{t+1}) \log p(x_{t+1}),   (22)

using the probability distribution p(x_{t+1}) over the sensor states x_{t+1} at time t + 1. The conditional entropy H(X_{t+1}|Y_t) characterizes how much is left uncertain about the future sensor states X_{t+1} given the robot's actions in the present, that is, the state of the motors at time t, and can be calculated in the standard manner20,21 from the joint probability distribution of present motor states and future sensor states p(x_{t+1}, y_t). As Eq. (21) implies, the predictive information measures how much of the entropy of sensorial states—that is, the uncertainty about what the detectors will record next—is explained by the motor states at the preceding time point. For example, if the motor states at time t perfectly predict what will appear in the sensors at time t + 1, then the predictive information is maximal.
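Estimating I_pred from behavioral data requires only the joint distribution of motor states and subsequent sensor states, per Eqs. (21) and (22). The Python sketch below is a minimal illustration: the state labels and the short time series are made up, not data from any of the robot experiments discussed here.

```python
import math
from collections import Counter

def predictive_information(pairs):
    """I_pred = I(Y_t : X_{t+1}) = H(X_{t+1}) - H(X_{t+1} | Y_t), Eq. (21),
    estimated from a list of (motor_state_t, sensor_state_t_plus_1) pairs."""
    n = len(pairs)
    p_joint = Counter(pairs)
    p_y = Counter(y for y, _ in pairs)
    p_x = Counter(x for _, x in pairs)

    def H(counter, total):
        return -sum((c / total) * math.log2(c / total) for c in counter.values())

    H_x = H(p_x, n)                          # H(X_{t+1}), Eq. (22)
    H_x_given_y = H(p_joint, n) - H(p_y, n)  # H(X_{t+1} | Y_t) = H(X,Y) - H(Y)
    return H_x - H_x_given_y

# Hypothetical time series: motor state at time t, sensor reading at t + 1.
series = [("forward", "wall"), ("forward", "wall"), ("turn", "open"),
          ("turn", "open"), ("forward", "open"), ("turn", "wall")]
print(predictive_information(series))
```

If the motor state at time t determined the next sensor reading exactly, the conditional entropy term would vanish and the estimate would equal H(X_{t+1}), the maximal value described above.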
Another version of the predictive information studies not the effect the motors have on future sensor states, but the effect the sensors have on future motor states instead, for example to guide an autonomous robot through a maze.51 In the former case, the predictive information quantifies how actions change the perceived world, whereas in the latter case the predictive information characterizes how the perceived world changes the robot's actions. Both formulations, however, are equivalent when taking into account how world and robot states are being updated.51 Although it is clear that measures such as predictive information should increase as an agent or robot learns to behave appropriately in a complex world, it is not at all clear whether information could be used as an objective function that, if maximized, will lead to appropriate behavior of the robot. This is the basic hypothesis of Linsker's "Infomax" principle,52 which posits that neural control structures evolve to maximize "information preservation" subject to constraints. This hypothesis implies that the infomax principle could play the role of a guiding force in the organization of perceptual systems. This is precisely what has been observed in experiments with autonomous robots evolved to perform a variety of tasks. For example, in one task visual and tactile information had to be integrated to grab an object,53 whereas in another, groups of five robots were evolved to move in a coordinated fashion54 or else to navigate according to a map.55 Such experiments suggest that there may be a deeper connection between information and fitness that goes beyond the regularities induced by a perception–action loop, that connects fitness (in the evolutionary sense as the growth rate of a population) directly to information.

As a matter of fact, Rivoire and Leibler18 recently studied abstract models of the population dynamics of evolving "finite-state agents" that optimize their response to a changing environment and found just such a relationship. In such a description, agents
