
Information Transmission with Symbols of Different Cost: Course Held at the Department of Automation and Information, June 1972

36 Pages·1972·2.017 MB·English

Preview of Information Transmission with Symbols of Different Cost: Course Held at the Department of Automation and Information, June 1972

INTERNATIONAL CENTRE FOR MECHANICAL SCIENCES
COURSES AND LECTURES - No. 136

IMRE CSISZÁR
HUNGARIAN ACADEMY OF SCIENCES, BUDAPEST

INFORMATION TRANSMISSION WITH SYMBOLS OF DIFFERENT COST
COURSE HELD AT THE DEPARTMENT OF AUTOMATION AND INFORMATION, JUNE 1972

SPRINGER-VERLAG WIEN GMBH 1972

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically those of translation, reprinting, re-use of illustrations, broadcasting, reproduction by photocopying machine or similar means, and storage in data banks.
© 1972 Springer-Verlag Wien
Originally published by Springer-Verlag Wien-New York 1972
ISBN 978-3-211-81136-8
ISBN 978-3-7091-2866-4 (eBook)
DOI 10.1007/978-3-7091-2866-4

PREFACE

These notes represent the material of the author's lectures at the CISM Summer courses in Udine, 1972. The author is indebted to Prof. L. Sobrero, Secretary General of CISM, for having invited him to give these lectures, and also to Prof. G. Longo, whose enthusiastic work in organizing this information theory course was a main factor of its success.

Udine, July 1972

A basic problem of information theory is the reliable transmission of messages with possibly small cost. The cost is most often measured by the number of symbols needed for the transmission (the length of the encoded message). It may happen, however, that the different symbols are of different cost (e.g. of different duration, as the Morse symbols dot, dash and space in telegraphy), in which case the problem of minimizing the average cost is different from that of minimizing the average number of symbols in the encoded message. This series of talks is devoted to problems of that kind. The results will very clearly demonstrate the fundamental role of Shannon's entropy as the measure of the amount of information. Moreover, our investigations will lead to significant consequences even when specialized to the simplest case of equal symbol costs, admittedly the case most often met in modern digital communication; this means that the theory to be developed may be of interest also for those who are not concerned with actually different symbol costs.

Chapter 1. THE SIMPLEST CODING PROBLEM

Let X be a (finite or countably infinite) set, let Y be a finite set consisting of d elements, and let Y* denote the set of all finite sequences (strings) of letters of the alphabet Y (i.e., of elements of the set Y). Let g : X → Y* be an encoding of the elements of X by (variable-length) code words, having the prefix property. Suppose that probabilities p(x) are associated with the elements of X (p(x) ≥ 0, \sum_{x \in X} p(x) = 1); then the average code word length is

(1.1)    L = \sum_{x \in X} p(x) \, \|g(x)\|,

where \|u\| denotes the length of the string u ∈ Y*. It is known that

(1.2)    L \ge \frac{H}{\log_2 d},    where    H = -\sum_{x \in X} p(x) \log_2 p(x),

and a prefix code exists with

(1.3)    L < \frac{H}{\log_2 d} + 1.

If the symbols y ∈ Y are of (possibly) different costs c(y) (c(y) > 0), then (1.1) is to be replaced by the average code word cost

(1.4)    L = \sum_{x \in X} p(x) \, c(g(x)),    where    c(u) = \sum_{i=1}^{n} c(y_i)    if    u = y_1 \dots y_n \in Y^*.

We sketch a proof of (1.2) which easily generalizes to yield a similar lower bound for (1.4). For each u ∈ Y*, let p(u) be the probability of having a code word beginning with the string u:

(1.5)    p(u) = \sum_{x :\, u \preceq g(x)} p(x),

the summation extending over those x for which g(x) begins with u, and let p(y|u) denote the (conditional) probability that the next symbol will be y:

(1.6)    p(y \mid u) = \frac{p(uy)}{p(u)}.
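As a concrete illustration of the quantities just defined, here is a short Python sketch; it is not part of the original notes, and the source distribution, the binary prefix code and the symbol costs in it are invented for the example. It computes the entropy H, the average code word length (1.1), the average code word cost (1.4), and the prefix probabilities p(u) of (1.5), and checks numerically that the sum of p(u) over the proper prefixes of the code words equals the average length, the identity used in the proof continued below.

```python
# Toy illustration of (1.1)-(1.5); the distribution, code and costs are invented.
import math
from collections import defaultdict

p = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}    # p(x), sums to 1
g = {"a": "0", "b": "10", "c": "110", "d": "111"}    # prefix code over Y = {0, 1}
cost = {"0": 1.0, "1": 2.0}                          # c(y): unequal symbol costs

H = -sum(px * math.log2(px) for px in p.values())             # entropy, cf. (1.2)
L_len = sum(p[x] * len(g[x]) for x in p)                      # average length, (1.1)
L_cost = sum(p[x] * sum(cost[y] for y in g[x]) for x in p)    # average cost, (1.4)

# p(u) of (1.5): probability that the code word begins with the string u,
# accumulated here for every proper prefix u (including the empty string).
p_u = defaultdict(float)
for x, word in g.items():
    for k in range(len(word)):
        p_u[word[:k]] += p[x]

print(f"H = {H:.3f} bits, average length = {L_len:.3f}, average cost = {L_cost:.3f}")
print(f"sum of p(u) over proper prefixes = {sum(p_u.values()):.3f} (equals the average length)")
```

For this particular toy code the average length happens to equal H, so the bound (1.2) is met with equality; its average cost is compared with the bounds of Theorems 1 and 2 in a second sketch further below.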
The average amount of information given by the symbol following u is

(1.7)    I(u) = -\sum_{y \in Y} p(y \mid u) \log_2 p(y \mid u).

Intuition suggests, and an easy calculation checks, that

(1.8)    H = -\sum_{x \in X} p(x) \log_2 p(x) = \sum_{u} p(u) \, I(u),

where the last summation is over those u ∈ Y* which are proper prefixes of at least one code word g(x). Since the average amount of information conveyed by one of d alternatives is upper bounded by \log_2 d, from (1.8) we obtain

(1.9)    H \le \log_2 d \, \sum_{u} p(u);

but \sum_{u} p(u) = \sum_{x \in X} p(x) \, \|g(x)\|, thus (1.9) is equivalent to (1.2).

To obtain a similar result for (1.4), the average amount of information should be upper bounded in terms of the average cost.

Lemma 1. For an arbitrary probability assignment p(y), y ∈ Y, we have

(1.10)    -\sum_{y \in Y} p(y) \log_2 p(y) \le \log_2 w_0 \, \sum_{y \in Y} p(y) \, c(y),

where w_0 is the (unique) positive number satisfying

(1.11)    \sum_{y \in Y} w_0^{-c(y)} = 1.

Proof. From the convexity of the function f(t) = t \log_2 t it easily follows that for arbitrary positive numbers a_i and b_i, summing up to a and b respectively, the inequality

(1.12)    \sum_{i} a_i \log_2 \frac{a_i}{b_i} \ge a \log_2 \frac{a}{b}

holds. We shall refer to (1.12) as the basic inequality. (To prove (1.12), apply the convexity inequality f(t) \ge f(t_0) + f'(t_0)(t - t_0) with f(t) = t \log_2 t to t = a_i / b_i, t_0 = a / b, multiply both sides by b_i and sum up for i.)

Applying the basic inequality to p(y) and w_0^{-c(y)} in the role of a_i and b_i, respectively, we obtain

(1.13)    \sum_{y \in Y} p(y) \log_2 \left( p(y) \, w_0^{c(y)} \right) \ge 0,

which is equivalent to (1.10).

Remark. The upper bound (1.10) is accurate in the sense that equality obtains for a particular probability assignment (namely, for p(y) = w_0^{-c(y)}).

Theorem 1. For any prefix code g : X → Y*, the average code word cost given by (1.4) is lower bounded by

(1.14)    L \ge \frac{H}{C},    where    C = \log_2 w_0.

Proof. Applying Lemma 1 to the conditional probabilities p(y|u), see (1.6), from (1.8) we obtain

(1.15)    H \le \log_2 w_0 \, \sum_{u} p(u) \sum_{y \in Y} p(y \mid u) \, c(y).

But the double sum on the right of (1.15) equals L = \sum_{x \in X} p(x) \, c(g(x)); thus (1.15) is equivalent to (1.14).

Theorem 2. There exists a prefix code g : X → Y* such that the average code word cost satisfies

(1.16)    L < \frac{H}{C} + c_{\max},

where c_{\max} denotes the largest symbol cost.

Proof. A code with average cost satisfying (1.16) can be constructed by a simple modification of the Shannon-Fano method. Let the probabilities p(x) be arranged into a non-increasing sequence p_1 \ge p_2 \ge \dots; divide the interval (0,1) into consecutive subintervals of lengths p_i, and let the left endpoint of each subinterval represent the corresponding x ∈ X. Divide the interval (0,1) into subintervals of lengths w_0^{-c(y)}, cf. (1.11); then divide each such subinterval containing at least two points representing different elements x ∈ X into subintervals of lengths proportional to w_0^{-c(y)}, etc. The code word of each x ∈ X is determined by the sequence of subintervals containing the point representing x; the length of the last of these subintervals is clearly

w_0^{-c(y_1)} \, w_0^{-c(y_2)} \cdots w_0^{-c(y_n)} = w_0^{-c(g(x))},    where    g(x) = y_1 \dots y_n.

The length of the previous subinterval was greater than p_i = p(x) (else it would have contained no point representing a different x), which means w_0^{-(c(g(x)) - c(y_n))} \ge p_i = p(x), and even more w_0^{-(c(g(x)) - c_{\max})} \ge p(x). Taking logarithms, multiplying by p(x) and summing up for all x ∈ X, we obtain

-(L - c_{\max}) \log_2 w_0 \ge \sum_{x \in X} p(x) \log_2 p(x) = -H,

proving (1.16).

Theorem 3. For block-to-variable-length encodings of a discrete memoryless source of entropy rate H = -\sum_{x \in X} p(x) \log_2 p(x), ...
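To make Lemma 1 and Theorems 1 and 2 concrete, the following companion sketch (again an illustration added here, not taken from the notes) solves (1.11) for w_0 by bisection, sets C = log_2 w_0, and checks the bounds H/C ≤ L < H/C + c_max for the same toy source, prefix code and costs as above; the helper function root_w0 is an assumption of the example, not anything defined in the text.

```python
# Companion sketch: compute w0 from (1.11), C = log2 w0, and check (1.14)/(1.16).
# The source, code and costs are the same invented toy example as above.
import math

p = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
g = {"a": "0", "b": "10", "c": "110", "d": "111"}
cost = {"0": 1.0, "1": 2.0}

def root_w0(costs, tol=1e-12):
    """Unique w0 > 1 with sum over y of w0**(-c(y)) = 1, found by bisection."""
    f = lambda w: sum(w ** (-c) for c in costs) - 1.0
    lo, hi = 1.0 + 1e-9, 2.0
    while f(hi) > 0:          # enlarge the bracket until the root lies inside
        hi *= 2.0
    while hi - lo > tol:      # f is decreasing, so keep f(lo) > 0 >= f(hi)
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

w0 = root_w0(list(cost.values()))
C = math.log2(w0)             # C = log2 w0, cf. (1.14): H/C lower-bounds the average cost
c_max = max(cost.values())

H = -sum(px * math.log2(px) for px in p.values())
L = sum(p[x] * sum(cost[y] for y in g[x]) for x in p)    # average cost, (1.4)

print(f"w0 = {w0:.6f}, C = {C:.6f}")
print(f"H/C = {H / C:.3f} <= L = {L:.3f} < H/C + c_max = {H / C + c_max:.3f}")
```

With the costs assumed here (1 and 2), equation (1.11) reduces to w^2 - w - 1 = 0, so w_0 is the golden ratio and C ≈ 0.694; the toy code's average cost 2.625 indeed lies between H/C ≈ 2.521 and H/C + c_max ≈ 4.521.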
