FLOATING-POINT ARITHMETIC • Floating-point representation and PDF

74 Pages·2003·0.2 MB·English

FLOATING-POINT ARITHMETIC

• Floating-point representation and dynamic range
• Normalized/unnormalized formats
• Values represented and their distribution
• Choice of base
• Representation of significand and of exponent
• Rounding modes and error analysis
• IEEE Standard 754
• Algorithms and implementations: addition/subtraction, multiplication and division

Digital Arithmetic - Ercegovac/Lang 2003, 8 - Floating-Point Arithmetic

VALUES REPRESENTED IN FLPT SYSTEM

Points on the number line, in order: -inf, A, B, b, C, d, D, E, +inf

[A, B] - negative floating-point numbers (normalized)
[D, E] - positive floating-point numbers (normalized)
(B, b] & [d, D) - denormals
C - zero
> E - positive overflow
< A - negative overflow
(B, C) - negative underflow (normalized)
(C, D) - positive underflow (normalized)

[Figure 8.1: a) Regions in floating-point representation. b) Example for m = f = 3, r = 2, and −2 ≤ E ≤ 1 (only positive region). Panel b shows the significands 0.100, 0.101, 0.110, 0.111 for each exponent E = -2, -1, 0, 1 on an axis marked 0, 1/8, 1/4, 1/2, 1, 2, with the denormals between 0 and 1/8.]
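The example of Figure 8.1(b) can be checked by direct enumeration. The following is a minimal sketch, assuming normalized significands of the form 0.1xx (f = 3 fraction bits, r = 2), exponents −2 ≤ E ≤ 1, and denormals formed with the minimum exponent; the function name `normalized_values` and the variable names are illustrative and do not come from the slides.

```python
from fractions import Fraction

def normalized_values(f, e_min, e_max, b=2):
    """Positive normalized values: significand 0.1xx... (f bits) times b**E."""
    values = []
    for E in range(e_min, e_max + 1):
        # Significands k / 2**f with the leading bit set, i.e. in [1/2, 1).
        for k in range(2 ** (f - 1), 2 ** f):
            values.append(Fraction(k, 2 ** f) * Fraction(b) ** E)
    return values

vals = normalized_values(f=3, e_min=-2, e_max=1)
D = min(vals)         # smallest positive normalized value (point D)
E_point = max(vals)   # largest positive value (point E)

# Denormals (assumed form): significands 0.001 .. 0.011 with the minimum exponent.
denormals = [Fraction(k, 2 ** 3) * Fraction(2) ** -2 for k in range(1, 2 ** 2)]

print("D =", D, " E =", E_point)
print("denormals:", [str(d) for d in denormals])
```

Under these assumptions the sketch gives D = 1/8 and E = 7/4, with the denormals 1/32, 1/16, and 3/32 filling the interval between zero and D.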
Point   Normalized                                   Unnormalized
A       $-(r^{m-f} - r^{-f}) \times b^{E_{max}}$     (same for both formats)
B       $-r^{m-f-1} \times b^{E_{min}}$              $-r^{-f} \times b^{E_{min}}$
C       $0$                                          (same for both formats)
D       $r^{m-f-1} \times b^{E_{min}}$               $r^{-f} \times b^{E_{min}}$
E       $(r^{m-f} - r^{-f}) \times b^{E_{max}}$      (same for both formats)

DISTRIBUTION FOR b = 2, m = f = 4, and e = 2

Significand   2^E = 1   2       4       8
0.1000        1/2       1       2       4
0.1001        9/16      9/8     9/4     9/2
0.1010        10/16     10/8    10/4    5
0.1011        11/16     11/8    11/4    11/2
0.1100        12/16     12/8    3       6
0.1101        13/16     13/8    13/4    13/2
0.1110        14/16     14/8    14/4    7
0.1111        15/16     15/8    15/4    15/2

DISTRIBUTION FOR b = 2, m = f = 3, and e = 3

Significand   2^E = 1   2     4     8    16    32    64    128
0.100         1/2       1     2     4    8     16    32    64
0.101         5/8       5/4   5/2   5    10    20    40    80
0.110         6/8       3/2   3     6    12    24    48    96
0.111         7/8       7/4   7/2   7    14    28    56    112

DISTRIBUTION FOR b = 4, m = f = 4, and e = 2

Significand   4^E = 1   4       16    64
0.0100        1/4       1       4     16
0.0101        5/16      5/4     5     20
0.0110        6/16      6/4     6     24
0.0111        7/16      7/4     7     28
0.1000        1/2       2       8     32
0.1001        9/16      9/4     9     36
0.1010        10/16     10/4    10    40
0.1011        11/16     11/4    11    44
0.1100        12/16     3       12    48
0.1101        13/16     13/4    13    52
0.1110        14/16     14/4    14    56
0.1111        15/16     15/4    15    60

DISTRIBUTION OF FLPT NUMBERS

[Figure 8.2: Examples of distributions of floating-point numbers. Number lines for (a) b = 2, f = 4, e = 2; (b) b = 2, f = 3, e = 3; (c) b = 4, f = 4, e = 2.]
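The three distribution examples for b = 2 and b = 4 can be regenerated programmatically, which makes it easier to compare how many values each system represents and how far they reach. The sketch below assumes an f-bit binary significand whose leading base-b digit is nonzero (value at least 1/b) and exponents 0 to 2^e − 1; the function name `distribution` is hypothetical.

```python
from fractions import Fraction

def distribution(b, f, e):
    """Return {k: [k/2**f * b**E for E = 0 .. 2**e - 1]} for normalized k.

    Assumes an f-bit binary significand k / 2**f, normalized so that its
    leading base-b digit is nonzero (k / 2**f >= 1/b), and exponents >= 0.
    """
    first = 2 ** f // b   # smallest normalized significand numerator
    return {
        k: [Fraction(k, 2 ** f) * b ** E for E in range(2 ** e)]
        for k in range(first, 2 ** f)
    }

for b, f, e in [(2, 4, 2), (2, 3, 3), (4, 4, 2)]:
    table = distribution(b, f, e)
    values = sorted(v for row in table.values() for v in row)
    print(f"b={b} f={f} e={e}: {len(values)} values, "
          f"min {values[0]}, max {values[-1]}")
```

With these assumptions the three systems yield 32, 32, and 48 values, with maxima 15/2, 112, and 60 respectively, matching the last rows of the corresponding tables.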
REPRESENTATION OF SIGNIFICAND AND EXPONENT

• SIGNIFICAND: SM (sign-and-magnitude) with HIDDEN BIT
• EXPONENT: BIASED, $E_R = E + B$; $\min E_R = 0 \Rightarrow B = -E_{min}$
• Symmetric range $-B \le E \le B \Rightarrow 0 \le E_R \le 2B \le 2^e - 1$
• for 8-bit exponent: $B = 127$, $-127 \le E \le 128$, $0 \le E_R \le 255$
• $E_R = 255$ not used
• SIMPLIFIES COMPARISON OF FLOATING-POINT NUMBERS (same as in fixed-point)
• MINIMUM EXPONENT REPRESENTED BY 0 SO THAT FLOATING-POINT VALUE 0: ALL ZEROS (0 sign, 0 exponent, 0 significand)

SPECIAL VALUES AND EXCEPTIONS

• Special values - not representable in the FLPT system
  - NAN (Not A Number)
  - Infinity (pos, neg)
  - allow computation in presence of special values
• Exceptions: result produced not representable - set a flag
  - Exponent overflow
  - Underflow

ROUNDOFF MODES AND ERROR ANALYSIS

• Exact results (inf. precision): x, y, etc.
• FLPT number representing x is $R_{mode}(x)$, where mode is the rounding mode
• Basic relations:
  1. If $x \le y$ then $R_{mode}(x) \le R_{mode}(y)$
  2. If x is a FLPT number then $R_{mode}(x) = x$
  3. If F1 and F2 are two consecutive FLPT numbers then for $F1 \le x \le F2$, $R_{mode}(x)$ is either F1 or F2

[Figure 8.3: Relation between x, $R_{mode}(x)$, and floating-point numbers F1 and F2.]
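The three basic relations for R_mode(x) can be tested exhaustively on a small system. The sketch below is an illustration under stated assumptions, not the method of the slides: it reuses the toy positive system of Figure 8.1(b) plus zero (without denormals), and the three modes "down", "up", and "nearest" (ties resolved toward F2) are common examples chosen here. The names `FLPT`, `neighbours`, and `R` are hypothetical.

```python
from bisect import bisect_left
from fractions import Fraction

# Toy system (assumed): zero plus significands 0.1xx with exponents -2..1.
FLPT = sorted({Fraction(0)} | {Fraction(k, 8) * Fraction(2) ** E
                               for E in range(-2, 2) for k in range(4, 8)})

def neighbours(x):
    """Return (F1, F2), consecutive FLPT numbers with F1 <= x <= F2.

    Only valid for 0 <= x <= max(FLPT); returns (x, x) if x is representable.
    """
    i = bisect_left(FLPT, x)
    if FLPT[i] == x:
        return x, x
    return FLPT[i - 1], FLPT[i]

def R(x, mode):
    """Round x to a FLPT number under the given (illustrative) mode."""
    F1, F2 = neighbours(x)
    if mode == "down":
        return F1
    if mode == "up":
        return F2
    # "nearest": ties go to F2 in this sketch (an assumption)
    return F1 if x - F1 < F2 - x else F2

xs = [Fraction(n, 64) for n in range(113)]   # grid over [0, 7/4]
for mode in ("down", "up", "nearest"):
    rounded = [R(x, mode) for x in xs]
    assert all(a <= b for a, b in zip(rounded, rounded[1:]))   # relation 1
    assert all(R(F, mode) == F for F in FLPT)                  # relation 2
    assert all(R(x, mode) in neighbours(x) for x in xs)        # relation 3
```

Exact rational arithmetic with `Fraction` is used so that the checks are not themselves affected by the rounding of the host machine's floating-point unit.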
