On Critical Relative Distance of DNA Codes for Additive Stem Similarity

arXiv:1001.1278v1 [cs.IT] 8 Jan 2010

A. D'yachkov, A. Voronina
Department of Probability Theory, Faculty of Mechanics and Mathematics, Moscow State University, Moscow, 119992, Russia
Email: [email protected], [email protected]

A. Macula, T. Renz
Air Force Res. Lab., IFTC, Rome Research Site, Rome NY 13441, USA
Email: [email protected], [email protected]

V. Rykov
Department of Mathematics, University of Nebraska at Omaha, 6001 Dodge St., Omaha, NE 68182-0243, USA
E-mail: [email protected]

Abstract: We consider DNA codes based on the nearest-neighbor (stem) similarity model, which adequately reflects the "hybridization potential" of two DNA sequences. Our aim is to present a survey of bounds on the rate of DNA codes with respect to a thermodynamically motivated similarity measure called an additive stem similarity. These results yield a method to analyze and compare known samples of the nearest-neighbor "thermodynamic weights" associated with stacked pairs that occur in DNA secondary structures.

I. INTRODUCTION

Single strands of DNA are represented by oriented sequences with elements from the alphabet $\mathcal{A} \triangleq \{A, C, G, T\}$. The reverse complement (Watson-Crick transformation) of a DNA strand is defined by first reversing the order of the letters and then replacing each letter $x$ with its complement $\bar{x}$, namely: A with T, C with G, and vice versa. For example, the reverse complement of AACG is CGTT. For a strand $\mathbf{x} = (x_1 x_2 \ldots x_{n-1} x_n) \in \mathcal{A}^n = \{A, C, G, T\}^n$, let

$$\widetilde{\mathbf{x}} \triangleq (\bar{x}_n \bar{x}_{n-1} \ldots \bar{x}_2 \bar{x}_1) \in \mathcal{A}^n = \{A, C, G, T\}^n \tag{1}$$

denote its reverse complement. If $\mathbf{y} = \widetilde{\mathbf{x}}$, then $\mathbf{x} = \widetilde{\mathbf{y}}$ for any $\mathbf{x} \in \mathcal{A}^n$. If $\mathbf{x} = \widetilde{\mathbf{x}}$, then $\mathbf{x}$ is called a self reverse complementary sequence. If $\mathbf{x} \neq \widetilde{\mathbf{x}}$, then the pair $(\mathbf{x}, \widetilde{\mathbf{x}})$ is called a pair of mutually reverse complementary sequences.

A (perfect) Watson-Crick duplex is the joining of oppositely directed $\mathbf{x}$ and $\widetilde{\mathbf{x}}$ so that every letter of one strand is paired with its complementary letter on the other strand in the double helix structure, i.e., $\mathbf{x}$ and $\widetilde{\mathbf{x}}$ are "perfectly compatible." However, when two oppositely directed DNA strands that are not necessarily complementary are "sufficiently compatible," they too are capable of coalescing into a double-stranded DNA duplex. The process of forming DNA duplexes from single strands is referred to as DNA hybridization. Crosshybridization occurs when two oppositely directed and non-complementary DNA strands form a duplex.

In general, crosshybridization is undesirable as it usually leads to experimental error. To increase the accuracy and throughput of the applications listed in [1]-[5], there is a desire to have collections of DNA strands, as large and as mutually incompatible as possible, so that no crosshybridization can take place. It is straightforward to view this problem as one of coding theory [6].

DNA nanotechnology often requires collections of DNA strands called free energy gap codes [7] that will correctly "self-assemble" into Watson-Crick duplexes and do not produce erroneous crosshybridizations. When these collections consist entirely of pairs of mutually reverse complementary DNA strands, they are called DNA tag-antitag systems [4] and DNA codes [7]-[13].

The best-known biological model to date that is commonly used to estimate hybridization energy is the "nearest-neighbor" similarity model introduced in [1]. Roughly, it implies that the hybridization energy for any two DNA strands should be calculated as a sum of thermodynamic weights of all stems that were formed in the process of hybridization. A stem is defined as a pair of consecutive DNA letters of either of the strands which coalesced with a pair of consecutive DNA letters of the other DNA strand. This biological model leads to a special similarity function on the space $\mathcal{A}^n$.

The first constructions of DNA codes known to the authors were suggested in [9]-[10]. They were based on conventional Hamming distance codes. Some methods of combinatorial coding theory have been developed [14]-[15] as a means by which such DNA codes can be found. From the very beginning it was understood that the hybridization energy for DNA strands should be simulated by a similarity function for sequences from $\mathcal{A}^n$. However, it is easy to see that Hamming similarity does not adequately capture the idea of the "nearest-neighbor" similarity model. It is therefore no wonder that further research focused primarily on the search for an appropriate similarity function.

One example of such a function was proposed in [16], where it was calculated as the sum of weights of all elements constituting the longest common Hamming subsequence. Later attempts included deletion similarity [8], which was earlier introduced by Levenshtein [17], and block similarity [12]-[13]. Both functions are non-additive, which allowed for consideration of such cases as shifts of DNA sequences along each other. Nevertheless, none of them captured the point of the "nearest-neighbor" similarity model.

In 2008 we published our first work [18] devoted to the study of stem similarity functions. There we considered the simplest case, when the similarity between two sequences from $\mathcal{A}^n$ is equal to the number of stems in the longest common Hamming subsequence between these two sequences. A common stem is understood as a block of length 2 which contains two adjacent elements of both of the initial sequences.

In [19], we introduced the concept of an additive stem $w$-similarity for an arbitrary weight function $w = w(a,b) > 0$, defined for all 16 elements $(ab) \in \mathcal{A}^2$, called stems. To calculate the additive stem $w$-similarity between two DNA sequences, one should add up the weights of all stems in the longest common Hamming subsequence between them (see Definition 1 below). Finally, our recent works [20]-[21] deal with the non-additive stem $w$-similarity function previously introduced in [7]. That model also counts the weights of all stems formed between two DNA sequences, with the only difference that these stems are contained not in a common Hamming subsequence but in a common subsequence in the sense of the Levenshtein insertion-deletion metric. For a more detailed discussion of the applicability of the proposed constructions to modeling DNA hybridization assays, please refer to [7].

In the current report we summarize the main results of [19] on the asymptotic behavior of the maximal size of DNA codes for the additive stem $w$-similarity function. We show how these results lead to a possible criterion, called the critical relative $w$-distance of DNA codes, for distinguishing between weight samples $w(a,b)$ found in different experiments. We also explain how our consideration prompts algorithms for composing DNA ensembles of optimal size for a given length of DNA strands.

II. ADDITIVE STEM w-SIMILARITY MODEL

A. Notations and Definitions

The symbol $\triangleq$ denotes definitional equalities and the symbol $[n] \triangleq \{1, 2, \ldots, n\}$ denotes the set of integers from 1 to $n$. Let $w = w(a,b) > 0$, $a, b \in \mathcal{A}$, be a weight function such that

$$w(a,b) = w(\bar{b}, \bar{a}), \quad a, b \in \mathcal{A}. \tag{2}$$

Condition (2) means that $w(a,b)$ is an invariant function under the Watson-Crick transformation.

Definition 1: [7],[19]. For $\mathbf{x}, \mathbf{y} \in \mathcal{A}^n$, the number

$$S_w(\mathbf{x}, \mathbf{y}) \triangleq \sum_{i=1}^{n-1} s_i^w(\mathbf{x}, \mathbf{y}), \quad \text{where} \quad s_i^w(\mathbf{x}, \mathbf{y}) \triangleq \begin{cases} w(a,b) & \text{if } x_i = y_i = a,\ x_{i+1} = y_{i+1} = b, \\ 0 & \text{otherwise,} \end{cases} \tag{3}$$

is called an additive stem $w$-similarity between $\mathbf{x}$ and $\mathbf{y}$.

The function $S_w(\mathbf{x}, \widetilde{\mathbf{y}})$ is used to model a thermodynamic similarity (hybridization energy) between DNA sequences $\mathbf{x}$ and $\mathbf{y}$. In virtue of (1)-(2), the function satisfies

$$S_w(\mathbf{x}, \mathbf{y}) = S_w(\mathbf{y}, \mathbf{x}) \le S_w(\mathbf{x}, \mathbf{x}), \quad \mathbf{x}, \mathbf{y} \in \mathcal{A}^n. \tag{4}$$

In addition,

$$S_w(\mathbf{x}, \widetilde{\mathbf{y}}) = S_w(\mathbf{y}, \widetilde{\mathbf{x}}), \quad \mathbf{x}, \mathbf{y} \in \mathcal{A}^n. \tag{5}$$

Identity (5) implies the symmetry property of hybridization energy between DNA sequences $\mathbf{x}$ and $\mathbf{y}$ [7]-[13].

Example 1: In [18] we considered constant weights $w = w(a,b) \equiv 1$, $a, b \in \mathcal{A}$, for which the additive stem 1-similarity $S_1(\mathbf{x}, \mathbf{y})$, $0 \le S_1(\mathbf{x}, \mathbf{y}) \le S_1(\mathbf{x}, \mathbf{x}) = n - 1$, is the above-mentioned number of stems in the longest common Hamming subsequence between $\mathbf{x}$ and $\mathbf{y}$.

Example 2: Table 1 shows a biologically motivated collection of weights $w(a,b) \triangleq U(a,b)$, called unified weights [2]:

U(a,b)   b=A    b=C    b=G    b=T
a=A      1.00   1.44   1.28   0.88
a=C      1.45   1.84   2.17   1.28
a=G      1.30   2.24   1.84   1.44
a=T      0.58   1.30   1.45   1.00
Table 1: Unified weights U(a,b), 1998.

The given values $U(a,b)$ are based on weight samples which come from [2] and [5] and are the nearest-neighbor "thermodynamic weights" (e.g., free energy of formation) associated with stacked pairs that occur in DNA secondary structures. See [3] for an introduction to the nearest-neighbor model.

Taking into account inequality (4), we give

Definition 2: [7],[19]. The number

$$D_w(\mathbf{x}, \mathbf{y}) \triangleq S_w(\mathbf{x}, \mathbf{x}) - S_w(\mathbf{x}, \mathbf{y}) = \sum_{i=1}^{n-1} \eta_i^w(\mathbf{x}, \mathbf{y}), \qquad \eta_i^w(\mathbf{x}, \mathbf{y}) \triangleq s_i^w(\mathbf{x}, \mathbf{x}) - s_i^w(\mathbf{x}, \mathbf{y}) \ge 0, \tag{6}$$

is called an additive stem $w$-distance between $\mathbf{x}, \mathbf{y} \in \mathcal{A}^n$.

Let $\mathbf{x}(j) \triangleq (x_1(j) x_2(j) \ldots x_n(j)) \in \mathcal{A}^n$, $j \in [N]$, be codewords of a $q$-ary code, $q = 4$, $X = \{\mathbf{x}(1), \mathbf{x}(2), \ldots, \mathbf{x}(N)\}$ of length $n$ and size $N$, where $N = 2, 4, \ldots$ is an even number. Let $D$, $0 < D \le \max_{\mathbf{x} \in \mathcal{A}^n} S_w(\mathbf{x}, \mathbf{x})$, be an arbitrary positive number.

Definition 3: [7],[19]. A code $X$ is called a DNA code of distance $D$ for additive stem $w$-similarity (3) (or an $(n, D)_w$-code) if the following two conditions are fulfilled.
(i) For any integer $j \in [N]$, there exists $j' \in [N]$, $j' \neq j$, such that $\mathbf{x}(j') = \widetilde{\mathbf{x}(j)} \neq \mathbf{x}(j)$. In other words, $X$ is a collection of $N/2$ pairs of mutually reverse complementary sequences.
(ii) The minimal $w$-distance of code $X$ is

$$D_w(X) \triangleq \min_{j \neq j'} D_w(\mathbf{x}(j), \mathbf{x}(j')) \ge D. \tag{7}$$

Let $N_w(n, D)$ be the maximal size of DNA $(n, D)_w$-codes for distance (6). If $d > 0$ is a fixed number, then

$$R_w(d) \triangleq \lim_{n \to \infty} \frac{\log_4 N_w(n, nd)}{n}, \quad d > 0, \tag{8}$$

is called a rate of DNA $(n, nd)_w$-codes for the relative distance $d > 0$.

B. Construction

Theorem 1: If $n = 2t + 1$, $t = 1, 2, \ldots$, then $N_1(n, n-1) = 16$.

Proof: Codewords of an $(n, n-1)_1$-code should not contain any common stems with each other. In particular, the stems in positions 1 and 2 of distinct codewords are pairwise different. Note that $|\mathcal{A}^2| = 16$, and hence for any $(n, n-1)_1$-code $X = \{\mathbf{x}(1), \ldots, \mathbf{x}(N)\}$

$$N = |\{(x_1(u) x_2(u)),\ u \in [N]\}| \le |\mathcal{A}^2| = 16.$$

Thus, $N_1(n, n-1) \le 16$.

Obviously, for odd $n$, the set $\mathcal{A}^n$ does not contain self reverse complementary words. For a stem $\mathbf{a} = (a_1 a_2) \in \mathcal{A}^2$, define $\mathbf{x}(\mathbf{a}) \triangleq (a_1 a_2 a_1 a_2 \ldots a_2 a_1 a_2 a_1) \in \mathcal{A}^n$. The code

$$X_r \triangleq \{\mathbf{x}(\mathbf{a}),\ \mathbf{a} \in \mathcal{A}^2\}, \quad |X_r| = 4^2 = 16,$$

constitutes a DNA $(n, n-1)_1$-code of size 16 for additive stem 1-similarity. Theorem 1 is proved.

Example 3: For instance, if $n = 5$, $D = n - 1 = 4$, then the 8 pairs of mutually reverse complementary codewords of code $X_r$ are:

(AAAAA, TTTTT), (ACACA, TGTGT),
(CCCCC, GGGGG), (CACAC, GTGTG),
(AGAGA, TCTCT), (ATATA, TATAT),
(CGCGC, GCGCG), (CTCTC, GAGAG).

Remark 1: Note that for any weight function $w$, the additive stem $w$-similarity $S_w(\mathbf{x}(\mathbf{a}), \mathbf{x}(\mathbf{b})) = 0$, $\mathbf{a}, \mathbf{b} \in \mathcal{A}^2$, $\mathbf{a} \neq \mathbf{b}$. Hence, the minimal $w$-distance (7) of code $X_r$ is

$$D_w(X_r) = \min_j S_w(\mathbf{x}(j), \mathbf{x}(j)) \ge 2t \cdot \underline{w},$$

where $\underline{w} \triangleq \min_{a,b \in \mathcal{A}} w(a,b)$. Thus, for any weight function $w$, the code $X_r$ is also an $(n, (n-1) \cdot \underline{w})_w$-code. For example, for the additive stem $U$-similarity of Example 2, the number $D_U(X_r) = 2t$. Therefore, the code $X_r$ is an $(n, n-1)_U$-code.

C. Bounds on Rate $R_w(d)$

Let $p \triangleq \{p(a,b),\ a, b \in \mathcal{A}\}$ be an arbitrary joint probability distribution on the set of stems $(ab) \in \mathcal{A}^2$, i.e.,

$$\sum_{a,b \in \mathcal{A}} p(a,b) = 1, \quad p(a,b) \ge 0 \ \text{for any } a, b \in \mathcal{A}.$$

To describe bounds on the rate $R_w(d)$, we will consider joint probability distributions $p$ such that the corresponding marginal probabilities coincide, i.e., for any $a \in \mathcal{A}$

$$p_1(a) \triangleq \sum_{b \in \mathcal{A}} p(a,b) = \sum_{b \in \mathcal{A}} p(b,a) \triangleq p_2(a) > 0 \tag{9}$$

and, in addition, the function $p(a,b)$, as well as the weight function (2), is invariant under the Watson-Crick transformation, i.e.,

$$p(a,b) = p(\bar{b}, \bar{a}) \ \text{for any } a, b \in \mathcal{A}. \tag{10}$$

Let

$$p_1(b|a) \triangleq \frac{p(a,b)}{p_1(a)}, \qquad p_2(b|a) \triangleq \frac{p(b,a)}{p_2(a)}$$

denote the corresponding conditional probabilities. It is easy to check that for distributions $p$ with properties (9)-(10), and for the corresponding conditional probabilities, the following equalities hold true for any $a, b \in \mathcal{A}$:

$$p_1(a) = p_2(a) = p_1(\bar{a}) = p_2(\bar{a}), \qquad p_1(b|a) = p_2(\bar{b}|\bar{a}). \tag{11}$$

For a fixed weight function (2), introduce the values

$$T_w \triangleq \max_{(9)} T_w(p), \qquad T_w(p) \triangleq \sum_{a,b \in \mathcal{A}} \left( p(a,b) - p^2(a,b) \right) w(a,b), \tag{12}$$

where the maximum is taken over all distributions $p$ for which condition (9) holds true. Note that if the weight function is invariant under the Watson-Crick transformation, then the maximizing distribution of (12) will satisfy conditions (10)-(11).

Applying an analog of the conventional Plotkin bound [6], one can prove

Theorem 2: [19] If $d \ge T_w$, then $R_w(d) = 0$.

Let $\mathbf{x} = (x_1 x_2 \ldots x_n) \in \mathcal{A}^n$ be the stationary Markov chain with initial distribution $p_1(a)$, $a \in \mathcal{A}$, and transition matrix $P = \| p_1(b|a) \|$, $a, b \in \mathcal{A}$, i.e.

$$\Pr\{x_i = a\} \triangleq p_1(a), \qquad \Pr\{x_{i+1} = b \mid x_i = a\} \triangleq p_1(b|a) \tag{13}$$

for any $a, b \in \mathcal{A}$ and $i \in [n-1]$.

Let a distribution $p$ satisfy (9) and let also the following Markov condition M be fulfilled: the transition matrix $P$ must define a Markov chain $\mathbf{x} = (x_1 x_2 \ldots x_n)$ such that for any pair of states $a, b \in \mathcal{A}$ there exists an integer $m \in [4]$ for which the conditional probability $\Pr\{x_{m+1} = b \mid x_1 = a\} > 0$.

Theorem 3: [19] For any probability distribution $p$ satisfying condition (9) and Markov condition M, and any relative distance $d$, $0 < d < T_w(p)$, the rate $R_w(d) > 0$.

Theorem 3 is established using the ensemble of random codes where independent codewords $\mathbf{x} = (x_1 x_2 \ldots x_n)$ are identically distributed in accordance with the Markov chain (13) and, in virtue of (11), the corresponding reverse complement codewords $\widetilde{\mathbf{x}} = (\bar{x}_n \bar{x}_{n-1} \ldots \bar{x}_2 \bar{x}_1)$ have the same distribution (13) as well. In addition, the proof of Theorem 3 is based on the Perron-Frobenius theorem (see [22], Theorem 3.1.1).

Let $T_w(p)$ be defined by (12) and

$$T_w^M \triangleq \max_{(9),\,M} T_w(p). \tag{14}$$

If $T_w = T_w^M$, then the corresponding weight function $w = w(a,b)$ is called regular, and non-regular otherwise. If a weight function $w = w(a,b)$ is regular, then $T_w$ is called the critical relative distance of $(n, dn)_w$-codes.

From Theorems 2 and 3 it follows

Corollary 1: [19] If a weight function $w = w(a,b)$ is regular, then the maximal size of $(n, nd)_w$-codes increases exponentially with increasing $n$ if and only if $0 < d < T_w$.

Remark 2: The results of Theorem 3 prompt the idea that the construction of optimal random DNA codes for additive stem $w$-similarity should be based on the generation of independent Markov chains with transition matrix $P$ and initial distribution $p_1(a)$ such that the corresponding distribution $p$ attains the maximum in (14).

III. WEIGHT SAMPLE ANALYSIS BASED ON CRITERION OF CRITICAL RELATIVE DISTANCE

In this section, we discuss samples of the weight function (or, briefly, weight samples) $w = w(a,b)$, $a, b \in \mathcal{A}$, taken from SantaLucia (1998) (see Table 1 in [2]). In Tables 2-8, we present weights $w(A,A) = w(T,T)$ and samples of relative weights $\widetilde{w}(a,b)$ with respect to $w(A,A)$, i.e., for any $a, b \in \mathcal{A}$,

$$\widetilde{w} = \widetilde{w}(a,b) \triangleq \frac{w(a,b)}{w(A,A)}, \qquad \widetilde{w}(a,b) = \widetilde{w}(\bar{b}, \bar{a}). \tag{15}$$

The pure numbers $\widetilde{w}(a,b)$ are convenient for mutual comparison and for comparison with the unified weights of Table 1.

w(A,A)=0.43   b=A    b=C    b=G    b=T
a=A           1.00   2.28   1.93   0.63
a=C           2.32   2.84   3.95   1.93
a=G           2.16   3.81   2.84   2.28
a=T           0.51   2.16   2.32   1.00
Table 2: Gotoh, 1981.

w(A,A)=0.89   b=A    b=C    b=G    b=T
a=A           1.00   1.35   1.52   0.91
a=C           1.54   1.84   2.24   1.52
a=G           1.40   2.20   1.84   1.35
a=T           0.85   1.40   1.54   1.00
Table 3: Vologodskii, 1984.

w(A,A)=0.67   b=A    b=C    b=G    b=T
a=A           1.00   1.69   1.75   0.93
a=C           1.78   2.31   2.79   1.75
a=G           1.67   2.76   2.31   1.69
a=T           1.04   1.67   1.78   1.00
Table 4: Blake, 1991.

w(A,A)=0.93   b=A    b=C    b=G    b=T
a=A           1.00   1.63   1.11   0.89
a=C           1.35   1.80   1.77   1.11
a=G           1.68   2.62   1.80   1.63
a=T           0.75   1.68   1.35   1.00
Table 5: Benight, 1992.

w(A,A)=1.02   b=A    b=C    b=G    b=T
a=A           1.00   1.40   1.14   0.72
a=C           1.35   1.74   2.05   1.14
a=G           1.43   2.24   1.74   1.40
a=T           0.59   1.43   1.35   1.00
Table 6: SantaLucia, 1996.

w(A,A)=1.20   b=A    b=C    b=G    b=T
a=A           1.00   1.25   1.25   0.75
a=C           1.42   1.75   2.33   1.25
a=G           1.25   1.92   1.75   1.25
a=T           0.75   1.25   1.42   1.00
Table 7: Sugimoto, 1996.

w(A,A)=1.66   b=A    b=C    b=G    b=T
a=A           1.00   0.68   0.81   0.72
a=C           1.08   1.66   1.98   0.81
a=G           0.85   1.70   1.66   0.68
a=T           0.46   0.85   1.08   1.00
Table 8: Breslauer, 1986.

A. Analysis of Tables 1-8 for Additive w-Distance

Analysis of Table 1 and Tables 3-7: The given weight samples are regular and the maximum in (12) is attained when $p(a,b) = 0$ if stem $(ab) \in L_4$, where the set $L_4$ of forbidden stems in the Markov chain (13) maximizing (12) has the form

$$L_4 \triangleq \{(AT), (TA), (AA), (TT)\}. \tag{16}$$

Below, in Table 1' and Tables 3'-7', we present the estimated values of joint probabilities $p(a,b)$ and marginal probabilities $p_1(a)$ for which the maximum in (12) is attained. Values of the critical relative distance $T_{\widetilde{w}}$ are given as well.

p(a,b)   b=A     b=C     b=G     b=T     p_1(a)
a=A      0       .0589   .0081   0       .067
a=C      .0610   .1544   .2095   .0081   .433
a=G      .0060   .2136   .1544   .0589   .433
a=T      0       .0060   .0610   0       .067
Table 1': Unified weights U(a,b). T_U = 1.58.

p(a,b)   b=A     b=C     b=G     b=T     p_1(a)
a=A      0       .0706   .0080   0       .078
a=C      .0638   .1411   .2087   .0080   .422
a=G      .0147   .1951   .1411   .0706   .422
a=T      0       .0147   .0638   0       .078
Table 3': Vologodskii, 1984. T_w̃ = 1.61.

p(a,b)   b=A     b=C     b=G     b=T     p_1(a)
a=A      0       .0331   .0346   0       .068
a=C      .0406   .1535   .2037   .0346   .432
a=G      .0270   .2188   .1535   .0331   .432
a=T      0       .0270   .0406   0       .068
Table 4': Blake, 1991. T_w̃ = 1.97.

p(a,b)   b=A     b=C     b=G     b=T     p_1(a)
a=A      0       .0675   .0144   0       .082
a=C      .0478   .1326   .2234   .0144   .418
a=G      .0340   .1841   .1326   .0675   .418
a=T      0       .0340   .0478   0       .082
Table 5': Benight, 1992. T_w̃ = 1.58.

p(a,b)   b=A     b=C     b=G     b=T     p_1(a)
a=A      0       .0608   .0095   0       .070
a=C      .0616   .1499   .2087   .0095   .430
a=G      .0087   .2102   .1499   .0608   .430
a=T      0       .0087   .0616   0       .070
Table 6': SantaLucia, 1996. T_w̃ = 1.55.

p(a,b)   b=A     b=C     b=G     b=T     p_1(a)
a=A      0       .0507   .0140   0       .065
a=C      .0444   .1551   .2217   .0140   .435
a=G      .0203   .2091   .1551   .0507   .435
a=T      0       .0203   .0444   0       .065
Table 7': Sugimoto, 1996. T_w̃ = 1.50.

Analysis of Table 2: The given weight sample is regular and the maximum in (12) is attained when $p(a,b) = 0$ if stem $(ab) \in L_6$, where the set $L_6$ of forbidden stems in the Markov chain (13) maximizing (12) has the form

$$L_6 \triangleq \{(AT), (TA), (AA), (TT), (AG), (CT)\}. \tag{17}$$

Below, in Table 2', we present the estimated values of the joint probabilities $p(a,b)$ and marginal probabilities $p_1(a)$ for which the maximum in (12) is attained. The estimated value of the critical relative distance $T_{\widetilde{w}} = 2.60$ is given as well.

p(a,b)   b=A     b=C     b=G     b=T     p_1(a)
a=A      0       .0593   0       0       .059
a=C      .0466   .1427   .2515   0       .441
a=G      .0127   .2261   .1427   .0593   .441
a=T      0       .0127   .0466   0       .059
Table 2': Gotoh, 1981. T_w̃ = 2.60.

Analysis of Table 8: The given weight sample $\widetilde{w}$ is a non-regular weight sample because the maximum in (12) is attained (with the maximal value $T_{\widetilde{w}} = 1.70$) for a probability distribution $p'(a,b)$, $(ab) \in \mathcal{A}^2$, which does not satisfy Markov condition M and has the form:

p'(a,b)  b=A     b=C     b=G     b=T     p'_1(a)
a=A      .0344   0       0       0       .034
a=C      0       .2190   .2466   0       .466
a=G      0       .2466   .2190   0       .466
a=T      0       0       0       .0344   .034
Table 8': Breslauer, 1986. T'_w̃ = 1.70.

This implies that for the weight sample $\widetilde{w}$ from Table 8, we cannot estimate the critical relative distance of optimal DNA codes based on additive stem $\widetilde{w}$-similarity.

B. Conclusion

For regular weight samples from Tables 2-7 (T2-T7), the descriptive analysis and comparison of critical parameters are summarized as follows:

         T2     T3     T4     T5     T6     T7
L        L_6    L_4    L_4    L_4    L_4    L_4
T_w̃     2.60   1.61   1.97   1.58   1.55   1.50

where the corresponding set $L$ ($L = L_4$ or $L = L_6$) of forbidden stems in codewords of optimal DNA codes, for which the critical relative distance $T_{\widetilde{w}}$ can be attained, is defined by (16) or by (17).

REFERENCES

[1] K. J. Breslauer, R. Frank, H. Blocker, L. A. Markey, "Predicting Duplex DNA Stability from the Base Sequence," Proc. National Academy of Sciences USA, vol. 83, pp. 3746–3750, 1986.
[2] J. SantaLucia, "A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics," Proc. National Academy of Sciences USA, vol. 95, pp. 1460–1465, 1998.
[3] M. Zuker, D. Mathews, D. Turner, "Algorithms and Thermodynamics for RNA Secondary Structure Prediction: A Practical Guide," in RNA Biochemistry and Biotechnology, J. Barciszewski & B. F. C. Clark, Eds. NATO ASI Series, Kluwer Academic Publishers, 1999.
[4] L. Kaderali, A. Deshpande, J. Nolan, P. White, "Primer-design for multiplexed genotyping," Nucleic Acids Res., vol. 31, pp. 1796–1802, 2003.
[5] J. SantaLucia, D. Hicks, "The thermodynamics of DNA structural motifs," Annu. Rev. Biophys. Biomol. Struct., vol. 33, pp. 415–440, 2004.
[6] F. J. MacWilliams, N. J. A. Sloane, The Theory of Error-correcting Codes, Amsterdam, The Netherlands: North Holland, 1977.
[7] M. A. Bishop, A. G. D'yachkov, A. J. Macula, T. E. Renz, V. V. Rykov, "Free Energy Gap and Statistical Thermodynamic Fidelity of DNA Codes," Journal of Computational Biology, vol. 14, n. 8, pp. 1088–1104, 2007.
[8] A. G. D'yachkov, P. A. Vilenkin, D. C. Torney, P. S. White, "Reverse-Complement Similarity Codes for DNA Sequences," in Proc. 2000 IEEE Int. Symp. Information Theory, Sorrento, Italy, 2000, pp. 330.
[9] V. V. Rykov, A. J. Macula, C. M. Korzelius, D. C. Engelhart, D. C. Torney, P. C. White, "DNA Sequences Constructed on the Basis of Quaternary Cyclic Codes," Proceedings of 4-th World Multiconference on Systemics, Cybernetics and Informatics, Orlando, Florida, USA, July 2000.
[10] A. Marathe, A. E. Condon, R. M. Corn, "On combinatorial DNA design," J. Comp. Biol., vol. 8, pp. 201–219, 2001.
[11] A. G. D'yachkov, P. L. Erdos, A. J. Macula, V. V. Rykov, D. C. Torney, C. S. Tung, P. A. Vilenkin, P. S. White, "Exordium for DNA Codes," J. Comb. Optimization, vol. 7, n. 4, pp. 369–379, 2003.
[12] A. G. D'yachkov, A. J. Macula, T. E. Renz, P. A. Vilenkin, I. K. Ismagilov, "New Results on DNA Codes," in Proc. 2005 IEEE Int. Symp. Information Theory, Adelaide, South Australia, Australia, 2005, pp. 283–288.
[13] A. G. D'yachkov, A. J. Macula, D. C. Torney, P. A. Vilenkin, P. S. White, I. K. Ismagilov, R. S. Sarbayev, "On DNA Codes," Problems of Information Transmission, vol. 41, n. 4, pp. 349–367, 2005.
[14] O. Milenkovic, N. Kashyap, "New Constructions of Codes for DNA computing," Proc. 2005 International Workshop on Coding and Cryptography (WCC 2005), Bergen, Norway, 2005, pp. 204–213.
[15] T. Abualrub, A. Ghrayeb, X. N. Zeng, "Construction of cyclic codes over GF(4) for DNA computing," Journal of the Franklin Institute, vol. 343, n. 4-5, pp. 448–457, 2006.
[16] A. G. D'yachkov, D. C. Torney, "On similarity codes," IEEE Trans. Inform. Th., vol. 46, n. 4, pp. 1558–1664, 2000.
[17] V. I. Levenshtein, "Efficient Reconstruction of Sequences from Their Subsequences and Supersequences," J. Comb. Th., Ser. A, vol. 93, pp. 310–332, 2001.
[18] A. G. D'yachkov, A. N. Voronina, "DNA Codes Based on Stem Hamming Similarity," in Proc. 11th Int. Workshop Algebraic and Combinatorial Coding Theory, Pamporovo, Bulgaria, 2008, pp. 85–91.
[19] A. G. D'yachkov, A. N. Voronina, "DNA Codes for Additive Stem Similarity," Problems of Information Transmission, vol. 45, n. 2, pp. 348–367, 2009.
[20] A. G. D'yachkov, A. J. Macula, T. E. Renz, V. V. Rykov, "Random Coding Bounds for DNA Codes Based on Fibonacci Ensembles of DNA Sequences," in 2008 IEEE Int. Symp. Information Theory, Toronto, Canada, 2008, pp. 2292–2296.
[21] A. G. D'yachkov, A. N. Voronina, A. J. Macula, T. E. Renz, V. V. Rykov, "DNA Codes for the Nearest-Neighbor Similarity," submitted to IEEE Trans. Inform. Th.
[22] A. Dembo, O. Zeitouni, Large Deviations Techniques and Applications, Boston, MA: Jones and Bartlett, 1993.
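
ILLUSTRATIVE SKETCH

The definitions of Section II can be checked numerically on small cases. The Python sketch below is an illustration only and is not taken from the paper or from [19]: every function name (`reverse_complement`, `similarity`, `distance`, `alternating_code`, `min_distance`, `t_value`, `table`) is hypothetical. Assuming sequences are plain strings over ACGT, it evaluates the additive stem similarity (3) and distance (6), builds the alternating code from the proof of Theorem 1, and evaluates the functional $T_w(p)$ of (12) on the unified weights of Table 1 with the distribution of Table 1'.

```python
from itertools import product

ALPHABET = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(x):
    """Reverse complement (1): reverse the strand, then complement each letter."""
    return "".join(COMPLEMENT[c] for c in reversed(x))


def table(rows):
    """Map four rows (a = A, C, G, T; columns b = A, C, G, T) to {(a, b): value}."""
    return {(a, b): rows[i][j]
            for i, a in enumerate(ALPHABET) for j, b in enumerate(ALPHABET)}


def similarity(x, y, w):
    """Additive stem w-similarity S_w(x, y), eq. (3): sum weights of common stems."""
    if len(x) != len(y):
        raise ValueError("sequences must have equal length")
    return sum(w[(x[i], x[i + 1])]
               for i in range(len(x) - 1)
               if x[i] == y[i] and x[i + 1] == y[i + 1])


def distance(x, y, w):
    """Additive stem w-distance D_w(x, y), eq. (6)."""
    return similarity(x, x, w) - similarity(x, y, w)


def alternating_code(n):
    """Code X_r from the proof of Theorem 1: x(a) = a1 a2 a1 a2 ... a1, n odd."""
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be odd and at least 3")
    return [(a1 + a2) * (n // 2) + a1 for a1, a2 in product(ALPHABET, repeat=2)]


def min_distance(code, w):
    """Minimal w-distance D_w(X), eq. (7), over ordered pairs of distinct codewords."""
    return min(distance(x, y, w) for x in code for y in code if x != y)


def t_value(p, w):
    """T_w(p), eq. (12): sum over all stems of (p - p^2) * w."""
    return sum((p[s] - p[s] ** 2) * w[s] for s in w)


# Constant weights of Example 1, unified weights of Table 1, distribution of Table 1'.
ONES = table([[1.0] * 4 for _ in range(4)])
U = table([[1.00, 1.44, 1.28, 0.88],
           [1.45, 1.84, 2.17, 1.28],
           [1.30, 2.24, 1.84, 1.44],
           [0.58, 1.30, 1.45, 1.00]])
P1 = table([[0.0, 0.0589, 0.0081, 0.0],
            [0.0610, 0.1544, 0.2095, 0.0081],
            [0.0060, 0.2136, 0.1544, 0.0589],
            [0.0, 0.0060, 0.0610, 0.0]])

if __name__ == "__main__":
    code = alternating_code(5)
    print(len(code), min_distance(code, ONES))  # 16 codewords at 1-distance n - 1 = 4
    print(round(t_value(P1, U), 2))             # about 1.58, the value quoted in Table 1'
```

For $n = 5$ the sketch reproduces the 16 codewords of Example 3 and their minimal 1-distance $n - 1 = 4$. It only evaluates (12) at the tabulated distribution; it does not perform the maximization in (12) or (14).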
