Codes: An Introduction to Information Communication and Cryptography
by Norman Biggs

Answers to Exercises

In some cases the 'answer' is just a hint, in others there is a full discussion. The answers to the odd-numbered exercises are as published in the book.

Chapter 1

1.1. 'Canine' has six letters and ends in 'nine'. The second message has two possible interpretations.

1.2. About 136 years.

1.3. The mathematical bold symbols A and B.

1.4. Modern Greek 24, Russian Cyrillic 33.

1.5. This exercise illustrates the point that decoding a message requires the making and testing of hypotheses. Here the rules are fairly simple, but that is not always so. In the first example, it is a fair guess that the numbers represent letters, and the simplest way of doing that is to let 1, 2, 3, ... represent A, B, C, ... . The number 27 probably represents a space. Testing this hypothesis, we find the message GOOD⊔LUCK (here ⊔ stands for the space symbol). The second example has the same number of symbols as the first, and each is represented by a word with 5 bits. How is this word related to the corresponding number in the first example?

1.6. Suppose the coded message is x_1 x_2 x_3 ... . There are two steps. Step 1: locate the spaces. Step 2: if x_i and x_j are consecutive spaces, switch the symbols x_{i+k} and x_{j−k} for k = 1, 2, ..., (j−i)/2.

1.7. s_1 s_2 and s_3 s_1 are both coded as 10010.

1.8. Yes, because the coded message can be split into blocks of the given length, each of which is a codeword representing a unique symbol.

1.9. In both cases S is the 27-symbol alphabet A. In the first example T = {1, 2, ..., 27}, and the coding function uses only strings of length 1. In the second example T = B, and the coding function S → B* is an injection into the subset B^5 of B*.

1.10. S contains the 26 letters of the English alphabet, the 10 digits, a number (at least 6) of accented letters, and a number (at least 4) of punctuation marks. The set T contains the dot and the dash, plus the three kinds of pauses mentioned in the text. (It is an interesting exercise to put the code-alphabet into a purely binary form.)

1.11. SOS; MAYDAY.

1.12. The code is −••••−−••, which is also the code for DEAD.

1.13. The number of ways of choosing 2 positions out of 8 is the binomial number C(8,2) = (8×7)/2 = 28. Hence at most 28 symbols can be represented in the semaphore code.

1.14. Buy, Sell, Sell.

1.15. Using words of length 2 there are only 4 possible codewords, so we need words of length 3, where we can choose any 6 of the 8 possibilities, say

    1 ↦ 001, 2 ↦ 010, 3 ↦ 011, 4 ↦ 100, 5 ↦ 101, 6 ↦ 110.

With this code, if one bit in a codeword is wrong, then the result is likely to be another codeword: for example, if the first bit in 110 is wrong, we get the codeword 010. This problem cannot be avoided if we are restricted to using words of length 3. In order to overcome the problem we must use codewords with the property that any two differ in at least two bits. In that case, if one bit in any codeword is in error, then the result is not a codeword, and the error will be detected. This can be arranged if we use codewords of length 4, for example

    1 ↦ 0000, 2 ↦ 1100, 3 ↦ 1010, 4 ↦ 1001, 5 ↦ 0110, 6 ↦ 0101.

1.16. MATHS⊔IS⊔GOOD⊔FOR⊔YOU.

1.17. No, because the message refers to ENGLAND, which did not exist in Caesar's time. Also it is written in English.

1.18. Any permutation of the 26 letters can be used in place of the cyclic permutation in the Caesar system.
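As a numerical check on 1.15 (not part of the published answers), the short Python sketch below confirms that any two of the six length-4 codewords differ in at least two bits, so a single bit-error can never turn one codeword into another. The helper name hamming is mine, not the book's.

    from itertools import combinations

    # The six length-4 codewords proposed in the answer to 1.15.
    codewords = ["0000", "1100", "1010", "1001", "0110", "0101"]

    def hamming(x, y):
        """Number of positions in which the words x and y differ."""
        return sum(a != b for a, b in zip(x, y))

    # Minimum distance over all pairs; 2 means every single bit-error is detectable.
    min_dist = min(hamming(x, y) for x, y in combinations(codewords, 2))
    print(min_dist)  # prints 2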
Chapter 2

2.1. s_3 s_4 s_2 s_1 s_4 s_2 s_3 s_1.

2.2. The codeword representing s_1 is a prefix of the codeword representing s_3.

2.3. The new code is s_1 ↦ 10, s_2 ↦ 1, s_3 ↦ 100. Clearly it is not prefix-free, since 1 is a prefix of both 10 and 100. However, it can be decoded uniquely by noting that each codeword has the form 1 followed by a string of 0's. Alternatively, decoding can be done by reading the codeword backwards. If the last bit is 1, the last symbol must be s_2. If it is 0, looking at the last-but-one bit enables us to decide if the last symbol is s_1 or s_3. Repeating the process, the entire word can be decoded uniquely. For example, 110101100 decodes as s_2 s_1 s_1 s_2 s_3.

2.4. Suppose the word q′ is a prefix of a codeword q. This implies that q′ is shorter than q. But the only occurrence of the pause symbol in q is the last symbol, so q′ cannot contain the pause symbol and is not a codeword. If the pause symbol is not used, then (for example) the codeword for D is a prefix of the codeword for B.

2.5. The code can be extended by adding words such as 011, 101, without losing the PF property.

2.6. The code can be extended.

2.7. 128.

2.8. At this stage, the best way to attack this question is by experiment. If we decide to represent one outcome by a word of length one, say 6 ↦ 0, then the PF condition means that only two words of length 2 are available, say 5 ↦ 10, 4 ↦ 11, and now we are stuck, because there are no more words available without violating the PF condition. We might use only one of the words of length 2, in which case we still get into difficulties with words of length 3 or 4. If we use no words of length 2, then there is a solution with n_1 = 1, n_2 = 0, n_3 = 3, n_4 = 2, and this has total word-length 18. If we reject the use of a word of length 1, similar arguments lead to the possibility n_1 = 0, n_2 = 2, n_3 = 4, where the total word-length is 16, for example

    1 ↦ 00, 2 ↦ 01, 3 ↦ 100, 4 ↦ 101, 5 ↦ 110, 6 ↦ 111.

This is the best possible.

2.9. (i) 00, 101, 011, 100, 101, 1100, 1101, 1110; (ii) 0, 100, 101, 1100, 1101, 1110, 11110, 11111.

2.10. (i) We have

    0/3 + 1/3^2 + 12/3^3 = 15/27 ≤ 1,

and so, according to Theorem 2.9, a PF code exists. (ii) Here we have

    0/3 + 1/3^2 + 12/3^3 + 40/3^4 = 85/81 > 1,

so nothing can be said at this stage. (In fact, it is proved later that if K > 1 then a PF code cannot exist.)

2.11. In part (i) take T = {0, 1, 2}; then the codewords could be 00 and any 12 of the 18 words 1∗∗, 2∗∗.

2.12. We have the equations

    n_1 + n_2 + ··· + n_M = m,    n_1/b + n_2/b^2 + ··· + n_M/b^M = 1.

Subtracting the second equation from the first,

    n_1·(b−1)/b + n_2·(b^2−1)/b^2 + ··· + n_M·(b^M−1)/b^M = m − 1.

Multiplying through by b^M and noting that b−1 is a divisor of b^i − 1, we conclude that b−1 is a divisor of (m−1)b^M. Since no divisor of b−1 can divide b, it follows that b−1 is a divisor of m−1.

2.13. The parameters n_1, n_2, n_3, n_4 must satisfy

    n_1 + n_2 + n_3 + n_4 = 12,    n_1/2 + n_2/4 + n_3/8 + n_4/16 ≤ 1.

These equations imply that 7n_1 + 3n_2 + n_3 ≤ 4, so n_1 = 0 and n_2 ≤ 1. Now it is easy to make a list of the possibilities:

    n_2 :  1  1  0  0  0  0  0
    n_3 :  1  0  4  3  2  1  0
    n_4 : 10 11  8  9 10 11 12

2.14. Q_1(x) = x + 5x^2, Q_2(x) = Q_1(x)^2 = x^2 + 10x^3 + 25x^4.

2.15. The coefficient of x^4 in Q_2(x) is the number of S-words of length 2 that are represented by T-words of length 4. These 25 words are all the words s_i s_j with i, j ∈ {2, 3, 4, 5, 6}.

2.16. Q_1(x) = x^2 + 2x^3 + 4x^4, Q_2(x) = x^4 + 4x^5 + 12x^6 + 16x^7 + 16x^8, Q_3(x) = x^6 + 6x^7 + 24x^8 + 56x^9 + 96x^10 + 96x^11 + 64x^12.

2.17. Use the fact that Q_r(x) = Q_1(x)^r.
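The Kraft-McMillan sums quoted in 2.10 (and used again in 2.13) are easy to check mechanically. Here is a minimal Python sketch; the helper name kraft_sum is mine, not the book's.

    from fractions import Fraction

    def kraft_sum(word_lengths, base):
        """Sum of base^(-length) over the proposed word-lengths."""
        return sum(Fraction(1, base ** n) for n in word_lengths)

    # 2.10(i): one ternary word of length 2 and twelve of length 3.
    print(kraft_sum([2] + [3] * 12, 3))             # 5/9, i.e. 15/27 <= 1, so a PF code exists
    # 2.10(ii): additionally forty words of length 4.
    print(kraft_sum([2] + [3] * 12 + [4] * 40, 3))  # 85/81 > 1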
Chapter 3

3.1. Occurrences of: a, about 60; ab, about 18.

3.2. No.

3.3. This is a deterministic source. The probability distribution associated with each ξ_k is trivial. For example, Pr(ξ_4 = 3) = 1, Pr(ξ_4 = n) = 0 for n ≠ 3.

3.4. If the symbols are s_1, s_2, s_3 with probabilities 1/2, 1/4, 1/4, then the obvious code is s_1 ↦ 0, s_2 ↦ 10, s_3 ↦ 11. This has average word-length L = 1.5.

3.5. Suppose the word-lengths are x_1 ≤ x_2 ≤ x_3, so the average is L = x_1·α + x_2·β + x_3·(1−α−β). The KM condition is

    1/2^{x_1} + 1/2^{x_2} + 1/2^{x_3} ≤ 1.

The least possible value of x_1 is 1. If also x_2 = 1 there is no possible value for x_3. If x_2 = 2, we must have x_3 = 2. So we get the 'obvious' solution x_1* = 1, x_2* = 2, x_3* = 2, which gives L = 2 − α. Any other solution must have x_i ≥ x_i*, with at least one strict inequality, and since L is an increasing function of x_1, x_2, x_3, the average word-length will also be greater.

3.6. H ≈ 2.24. If all symbols are equally likely H ≈ 2.32.

3.7. 0.

3.8. The expression for the entropy has m terms, each of them (1/m)·log_b m. Hence the entropy is log_b m.

3.9. Start by proving that

    U·H(u_1/U, u_2/U, ..., u_m/U) = U·log U + Σ_{i=1}^{m} u_i·log(1/u_i).

3.10. For the sake of comparison we can use any base. Taking b = 2, the entropies are 1.295 and 1.485, approximately, so the second source has greater uncertainty. The uncertainty is greatest when all symbols are equally likely, that is, when the probability distribution is (1/3, 1/3, 1/3), and the entropy is log_2 3 ≈ 1.585.

3.11. h′(x) = (1/ln 2)·log((1−x)/x). This is zero when (1−x)/x = 1, that is, x = 1/2. As x tends to 0 or 1, log((1−x)/x) tends to ±∞.

3.12. The result says that the uncertainty of the outcome is the sum of three components: the uncertainty as to whether the winner is a man or a woman; if the winner is a man, the uncertainty as to which man; if the winner is a woman, the uncertainty as to which woman.

3.13. The SF rule says that the word-lengths are such that x_1 is the least integer such that 2^{x_1} ≥ 1/0.25 = 4, that is, x_1 = 2, and so on. The results are x_1 = 2, x_2 = 4, x_3 = 3, x_4 = 5, x_5 = 3, x_6 = 2. The average word-length is 2.7, and the entropy is H(p) ≈ 2.42.

3.14. The SF rule gives word-lengths 1, 2, 2. But obviously the optimal code has all three codewords with length 1.

3.15. The probabilities at each stage are as follows (no attempt has been made to make them increase from left to right); the new entry in each line is the one formed by the amalgamation.

    0.25 0.1  0.15 0.05 0.2  0.25
    0.25 0.15 0.15 0.2  0.25
    0.25 0.3  0.2  0.25
    0.25 0.3  0.45
    0.55 0.45
    1.0

Using rule H2 gives the following codewords (minor variations are possible): 01, 0011, 000, 0010, 10, 11, with L = 2.45.

3.16. L_SF = 2.4, L_opt = 1.9.

3.17. The entropy is 2.72 approximately. The SF rule gives codewords of lengths 3, 3, 3, 4, 4, 4, 4 with average word-length L_SF = 3.4. This satisfies the SF condition H(p) ≤ L_SF < H(p) + 1. The sequence of probability distributions generated by the Huffman rule is as follows (the probabilities have not been re-ordered on each line).

    0.2 0.2 0.2 0.1 0.1 0.1 0.1
    0.2 0.2 0.2 0.1 0.1 0.2
    0.2 0.2 0.2 0.2 0.2
    0.2 0.2 0.2 0.4
    0.2 0.4 0.4
    0.6 0.4
    1.0

Using H2 the codewords are 00, 010, 011, 100, 101, 110, 111. (Several choices are possible, but all of them will produce a code with one word of length 2 and six of length 3.) The average word-length is L_opt = 2.8.

3.18. Codewords: 0, 10, 110, 1110, 11110, 11111.

3.19. Use Lemma 3.17.

3.20. Clearly ℓ(C^{(N−2)}) = 1 + 1 = 2. In the construction of C^{(N−i−1)} from C^{(N−i)}, one word of maximum length m in C^{(N−i)} is replaced by two words of length m+1, so the increase in total length is m+2. All words in C^{(N−i)} have length at most i−1, so ℓ(C^{(N−i−1)}) ≤ ℓ(C^{(N−i)}) + (i+1). Hence

    ℓ(C) ≤ 2 + 3 + ··· + N = (N^2 + N − 2)/2.
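The average word-lengths quoted in 3.15 and 3.17 can be checked without drawing the trees, using the standard fact that in Huffman's construction the average word-length equals the sum of the probabilities created at the merging steps. A short Python sketch of that check (the function name is mine, and this is only a numerical verification, not the book's method):

    import heapq

    def huffman_average_length(probs):
        """Average codeword length of an optimal binary (Huffman) code.

        The average length equals the sum of the probabilities formed
        at each merging step of the construction.
        """
        heap = list(probs)
        heapq.heapify(heap)
        total = 0.0
        while len(heap) > 1:
            a = heapq.heappop(heap)
            b = heapq.heappop(heap)
            total += a + b
            heapq.heappush(heap, a + b)
        return total

    print(huffman_average_length([0.25, 0.1, 0.15, 0.05, 0.2, 0.25]))   # approx. 2.45 (3.15)
    print(huffman_average_length([0.2, 0.2, 0.2, 0.1, 0.1, 0.1, 0.1]))  # approx. 2.8  (3.17)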
3.21. Using tree diagrams, a few trials produce the code 0, 10, 11, 12, 20, 21, 22, which has average word-length 1.8. Since the probabilities are all multiples of 1/10, the average word-length of any such code is a number of the form m/10. However the entropy with respect to encoding by a ternary alphabet is H_3(p) = H(p)/log_2 3 ≈ 1.72. Hence the word-length 1.8 must be optimal.

3.22. Let m = 2^r + s (0 ≤ s ≤ 2^r − 1). Then the optimal code has 2^r − s words of length r and 2s words of length r+1. When m = 400 this means 112 words of length 8 and 288 words of length 9. The average word-length is

    L = r + 2s/(2^r + s).

Chapter 4

4.1. k ≥ rℓ + ℓ − 1.

4.2. AABACBBCCA. (Other solutions are possible.)

4.3. The entropy of the given distribution is approximately 0.72. For the obvious code A ↦ 0, B ↦ 1, clearly L_1 = 1. For blocks of length 2 the probabilities are 0.64, 0.16, 0.16, 0.04, and a Huffman code is AA ↦ 0, AB ↦ 10, BA ↦ 110, BB ↦ 111. This has average word-length 1.56 and L_2/2 = 0.78. For blocks of length 3 the probabilities and a Huffman code are

    AAA    AAB    ABA    BAA    ABB    BAB    BBA    BBB
    0.512  0.128  0.128  0.128  0.032  0.032  0.032  0.008
    0      100    101    110    11100  11101  11110  11111

Thus L_3 = 2.184 and L_3/3 = 0.728. This suggests that the limit of L_n/n as n → ∞ is the entropy, 0.72 approximately.

4.4. No.

4.5. H(p) ≈ 2.446, H(p′) ≈ 0.971, H(p″) ≈ 1.571.

4.6. The distribution p′ is [0.5, 0.5], and p″ is the same. In both cases the entropy is 1. H(p) ≈ 1.881 < 2, in agreement with Theorem 4.4.

4.7. p^1 = [0.4, 0.3, 0.2, 0.1]. Not memoryless.

4.8. The following distribution p^3 on B^3 induces the given p^2, for any t in the range 0 < t < 0.2.

    000  001    010    011    100    101    110    111
    t    0.2−t  0.1+t  0.2−t  0.2−t  0.1+t  0.2−t  t

Thus we can imagine a source that is not stationary, because p^3 varies with t, but nevertheless complies with the conditions stated in the question.

4.9. H ≤ H(p^2)/2 ≈ 1.26.

4.10. See the proof of Lemma 4.10.

4.11. Use the hint given.

4.12. The number of codewords required is 262,000, approximately.

4.13. It is sufficient to consider the range 0 < x < 0.5, when the original distribution is [x^2, x(1−x), x(1−x), (1−x)^2] and the numerical values increase in the order given. The first step is to amalgamate x^2 and x(1−x), giving the distribution [x, x(1−x), (1−x)^2]. Since x > x − x^2 the middle term is always one of the two smallest. The other one is x if 0 < x ≤ q and (1−x)^2 if q < x ≤ 0.5, where q is the point where x = (1−x)^2. In fact, q = (3 − √5)/2 ≈ 0.382. In the first case the word-lengths of the optimal code are 3, 3, 2, 1, and in the second case they are 2, 2, 2, 2. Hence

    L_2(x) = 1 + 3x − x^2 (0 < x ≤ q),    L_2(x) = 2 (q < x ≤ 0.5).

4.14. 29/64, 73/128.

4.15. In binary notation 1/3 is represented as .010101..., where the sequence 01 repeats forever. In general a rational number is represented by an expansion that either terminates or repeats, depending on the base that is used. For example, the representation of 1/3 repeats in base 2 and base 10, but terminates in base 3.

4.16. The steps are as follows:

    X    P     a     1/P   n_P   n   c     c(X)
    ωα   0.02  0.92  50    6     7   118   1110110

4.17. It is enough to calculate the values of n_P, since the average word-length is L_2 = 1 + Σ P·n_P ≈ 4.44. Since H(p^1) ≈ 1.685 and H(p^2)/2 ≈ 1.495, the entropy does not exceed 1.495, whereas L_2/2 ≈ 2.22.

4.18. The codewords are 001, 0111, 100100, 1010, 11001, 111000, 111010, 1111011, 1111110, so the average word-length is 4.12. For the Huffman code (optimal for this source) the average word-length is 2.62.

4.19. For X = x_1 x_2 ... x_{r−1} x_r let X* = x_1 x_2 ... x_{r−1} α, where α is the first element of S. The probabilities P(Y) with Y < X can be divided into two sets: those with Y < X* and those with X* ≤ Y < X.
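The entries in the table of 4.16 are consistent with taking n = ⌈log_2(1/P)⌉ + 1 and c = ⌈a·2^n⌉. The following Python sketch reproduces the row under that assumption; the exact formulas should be taken from the text, not from this code.

    import math

    # Values quoted in the answer to 4.16 for the word X = ωα.
    P, a = 0.02, 0.92

    n = math.ceil(math.log2(1 / P)) + 1     # assumed rule: n = ceil(log2(1/P)) + 1
    c = math.ceil(a * 2 ** n)               # assumed rule: c = ceil(a * 2^n)
    codeword = format(c, "0{}b".format(n))  # c written in binary, using n bits

    print(n, c, codeword)  # 7 118 1110110, matching the table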
4.20. In the memoryless case, P(x_1 x_2 ... x_{r−1}) is equal to the product of the P(x_i) (1 ≤ i ≤ r−1). Hence the right-hand side of the equation gives a(X) in terms of a(X′), where X′ = x_1 x_2 ... x_{r−1}, and the values of P and a on the individual symbols. Applying this equation recursively leads to the result. For example, a(βωγβ) can be calculated from the equations

    a(βωγβ) = a(βωγ) + P(β)P(ω)P(γ)a(β),
    a(βωγ) = a(βω) + P(β)P(ω)a(γ),
    a(βω) = a(β) + P(β)a(ω).

4.21. L_2 ≈ 5.01. H = H(P) ≈ 1.68 < L_2/2.

4.22. (i) 110001; (ii) 01101101.

4.23. Check that the encoding rule creates dictionaries as in Example 4.25.

4.24. Check that the decoding rule creates dictionaries as in Example 4.23.

4.25. The message begins BET⊔ON⊔TEN⊔... .

4.26. After Step 5 the decoded message is abacabca and the dictionary is D_5 = a, b, c, ab, ba, ac, ca, abc. At Step 6 we have c_6 = 7, which is decoded as d_7 = ca. The new dictionary entry has the form cax, where x must be determined by considering c_7. The awkward case arises here, because c_7 = 9 and there is no entry d_9 in the current dictionary. However, using the argument given in the proof of Theorem 4.24, it follows that x must be the same as the first symbol in the new entry, that is x = c.

Chapter 5

5.1. The channel matrix is

    ( 1−a    a  )
    (  b    1−b )

5.2. The first row is (1−2x, x, x).

5.3. Let p be the input to Γ_1, q the output from Γ_1 and input to Γ_2, r the output from Γ_2. Then r = qΓ_2 = pΓ_1Γ_2. Hence Γ is the matrix product Γ_1Γ_2.

5.4. The result is a BSC with bit-error probability e + e′ − 2ee′.

5.5. Since e_{n+1} = e_n + e − 2e·e_n it follows that e_n → 1/2 as n → ∞. When e = 0, we have e_n = 0 for all n, so the limit is zero. When e = 1 the value of e_n alternates between 0 and 1, so there is no limit.

5.6. q = [0.598, 0.402]. H(p) ≈ 0.9709, H(q) ≈ 0.9721. As we should expect, the uncertainty is increased (but only slightly in this case) by transmission through a noisy channel.

5.7. q_0 = 0.98p_0 + 0.04p_1, q_1 = 0.02p_0 + 0.96p_1.

5.8. If the input distribution is [p, 1−p] it follows from the data that p(1−a) + (1−p)b = 0.5. Hence p = (1−2b)/(2(1−a−b)). The entropy of the output is h(0.5) = 1, whereas the entropy of the input is h(p) ≤ 1, with equality if and only if p = 0.5. This condition implies a = b, and we are given that a ≠ b, so h(p) < 1.

5.9. t_{00} = 0.594, t_{01} = 0.006, t_{10} = 0.004, t_{11} = 0.396. H(t) ≈ 1.0517 and (as in Exercise 5.6) H(q) ≈ 0.9721, hence H(Γ;p) ≈ 0.0796.

5.10. h(p) + h(e) − h(q) ≈ 0.9709 + 0.0808 − 0.9721 = 0.0796, as in the previous exercise.

5.11. We have q = pΓ = [pc, (1−p)c, 1−c], from which it follows by direct calculation that H(q) = h(c) + c·h(p). For the joint distribution t the values are pc, 0, p(1−c) and 0, (1−p)c, (1−p)(1−c). Thus (by the argument used in the proof of Theorem 5.9) H(t) = h(p) + h(c), and

    H(Γ;p) = H(t) − H(q) = h(p) + h(c) − h(c) − c·h(p) = (1−c)·h(p).

5.12. p_0 = a+b, p_1 = c+d, q_0 = a+c, q_1 = b+d.

    Γ = ( a/(a+b)   b/(a+b) )
        ( c/(c+d)   d/(c+d) )

5.13. (i) 1/27; (ii) 1; (iii) 3.

5.14. By Theorem 3.11 the maximum is log_2 m at (1/m, 1/m, ..., 1/m).

5.15. The channel matrix Γ is a 2N × 2 matrix with rows alternately 0 1 and 1 0. If the input source is [p_1, p_2, ..., p_{2N}], the joint distribution t is given by

    t_{1,0} = 0, t_{1,1} = p_1, t_{2,0} = p_2, t_{2,1} = 0, ..., t_{2N,0} = p_{2N}, t_{2N,1} = 0.

Hence H(t) = H(p). It follows that

    H(p) − H(Γ;p) = H(p) − H(p|q) = H(p) − H(t) + H(q) = H(q).

So the maximum of H(p) − H(Γ;p) is the maximum of H(q), which is 1. This means that the channel can transmit one bit of information about any input: specifically, it tells the Receiver whether the input is odd or even.
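The figures quoted in 5.6, 5.9 and 5.10 are consistent with a BSC with bit-error probability e = 0.01 and input distribution p = [0.6, 0.4]; these values are inferred from the answers, so check them against the exercise data. A short Python sketch reproducing the numbers under that assumption:

    from math import log2

    def H(dist):
        """Entropy (base 2) of a probability distribution, ignoring zero entries."""
        return -sum(x * log2(x) for x in dist if x > 0)

    e = 0.01        # assumed bit-error probability
    p = [0.6, 0.4]  # assumed input distribution
    q = [p[0] * (1 - e) + p[1] * e, p[0] * e + p[1] * (1 - e)]  # output distribution q = p Gamma
    t = [p[0] * (1 - e), p[0] * e, p[1] * e, p[1] * (1 - e)]    # joint distribution t

    print(q)                 # [0.598, 0.402]                    (5.6)
    print(H(p), H(q), H(t))  # approx. 0.9709, 0.9721, 1.0517    (5.6, 5.9)
    print(H(t) - H(q))       # approx. 0.0796 = H(Gamma; p)      (5.9, 5.10)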
5.16. The capacity is the maximum of H(q) − H(q|p). We have H(q) = h(q) and H(q|p) = p·H(q|0) + (1−p)·H(q|1), where H(q|0) = h(a) and H(q|1) = h(b). If x_1, x_2 satisfy the given equation then, since q = pΓ, it follows that

    p·h(a) + (1−p)·h(b) = q·x_1 + (1−q)·x_2.

Hence the function to be maximized can be written in the form F(q), as stated. At the maximum, q is given by

    F′(q) = h′(q) − (x_1 − x_2) = log((1−q)/q) − (x_1 − x_2) = 0.

So q = (2^{x_1−x_2} + 1)^{−1}, which gives the capacity as stated.

5.17. For each input x,

    H(q|x) = 2α·log(1/α) + (1−2α)·log(1/(1−2α)) = k(α), say.

Hence H(q|p) = k(α), which is constant. The maximum of H(q) occurs when q = [1/4, 1/4, 1/4, 1/4], and so the capacity is 2 − k(α).

5.18. Since q_j = Σ_i t_{ij} it follows that Σ_i ∆_{ij} = 1. Let S denote the sum Σ_j q_j·H(p|j); show that S + H(q) = H(t), as in the proof of Theorem 5.13.

Chapter 6

6.1. One of the instructions S, E, W.

6.2. σ(101000) = 111000, σ(101111) = 111111, σ(100111) = 000111.

6.3. Yes, because 100100 is more like 000000 than any other codeword.

6.4. The numbers are 0.9801, 0.0099, 0.0099, 0.0001.

6.5. The first column is (1−a)^2, (1−a)b, (1−a)b, b^2.

6.6. For the triangle inequality, note that if x can be converted to y by making α changes, and y can be converted to z by making β changes, then x can be converted to z by making α+β changes, at most.

6.7. c_7, c_5, c_3.

6.8. The possible received words z are:

    z_1 = 01000, z_2 = 10000, z_3 = 11100, z_4 = 11010, z_5 = 11001.

For the first two there are four codewords at distance 1, and for the rest there are three.

6.9. The nearest codewords for z_1 are 11000, 01100, 01010, 01001. So σ(z_1) is certainly not 10001, and this event has probability zero. The nearest codewords for z_2 are 11000, 10100, 10010, 10001. The rule is that one of them is chosen as σ(z_2) with probability 1/4, so the probability that z_2 is received and σ(z_2) = 10001 is e/4. The nearest codewords for z_3 and z_4 do not include 10001, so (like z_1) these contribute nothing. The nearest codewords for z_5 are ...
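The Chapter 6 answers above all appeal to minimum-distance (nearest-codeword) decoding. As a generic sketch in Python: the helper names are mine, and the codeword set below is only a hypothetical choice that happens to be consistent with the answers to 6.2 and 6.3, not a set taken from the exercises.

    def hamming(x, y):
        """Number of positions in which two words of equal length differ."""
        return sum(a != b for a, b in zip(x, y))

    def nearest(z, codewords):
        """A codeword at minimum Hamming distance from the received word z."""
        return min(codewords, key=lambda c: hamming(z, c))

    # Hypothetical codeword set, chosen only because it agrees with 6.2 and 6.3.
    C = ["000000", "111000", "000111", "111111"]

    for z in ["101000", "101111", "100111", "100100"]:
        print(z, "->", nearest(z, C))
    # 101000 -> 111000, 101111 -> 111111, 100111 -> 000111 (as in 6.2),
    # and 100100 -> 000000 (as in 6.3).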
