Chapter 33. Floating-Point Numbers Aren’t Real
FLOATING-POINT NUMBERS ARE NOT “REAL NUMBERS” in the mathematical sense, even though they are called real in some programming languages, such as Pascal and Fortran. Real numbers have infinite precision and are therefore continuous and nonlossy; floating-point numbers have limited precision, so they are finite, and they resemble “badly behaved” integers, because they’re not evenly spaced throughout their range.
To illustrate, assign 2147483647 (the largest signed 32-bit integer) to a 32-bit float variable (x, say), and print it. You’ll see 2147483648. Now print x - 64. Still 2147483648. Now print x - 65, and you’ll get 2147483520! Why? Because the spacing between adjacent floats in that range is 128, and floating-point operations round to the nearest floating-point number.
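The experiment above can be reproduced with a short sketch. Python's built-in float is a 64-bit double, so this version assumes that a round trip through the standard struct module's "f" format is an acceptable stand-in for storing a value in a 32-bit float. The helper name to_float32 is hypothetical and is not part of the original text.

```python
import struct

def to_float32(value):
    """Round a 64-bit Python float to the nearest 32-bit float (hypothetical helper)."""
    return struct.unpack("f", struct.pack("f", float(value)))[0]

# 2147483647 is not representable as a 32-bit float, so it rounds up to 2**31.
x = to_float32(2147483647)

# The subtraction is exact in double precision; rounding the result back to
# 32 bits stands in for doing the arithmetic in a 32-bit float.
x_minus_64 = to_float32(x - 64)  # exactly halfway between neighbors; the tie goes to 2**31
x_minus_65 = to_float32(x - 65)  # closer to the neighbor 128 below 2**31

print(int(x), int(x_minus_64), int(x_minus_65))
# prints: 2147483648 2147483648 2147483520
```

The x - 64 case lands exactly halfway between the two neighboring floats, and the default IEEE rounding mode resolves that tie toward the neighbor with an even significand, which here is 2147483648.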
IEEE floating-point numbers are fixed-precision numbers based on base-two scientific notation: $1.d_1 d_2 \ldots d_{p-1} \times 2^e$, where $p$ is the precision (24 for float, 53 for double). The spacing between two consecutive numbers is $2^{1-p+e}$, which can be safely approximated by $\varepsilon|x|$, where $\varepsilon$ is the machine epsilon ($2^{1-p}$).
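As a rough check of the spacing formula, the sketch below computes $2^{1-p+e}$ directly and compares it with what the Python standard library reports for doubles. The function name spacing and the sample values are illustrative assumptions. math.ulp requires Python 3.9 or later, and the formula as written covers only normal, nonzero values.

```python
import math
import sys

def spacing(x, p):
    """Gap 2**(1 - p + e) to the next float of larger magnitude, for normal nonzero x."""
    _, k = math.frexp(abs(x))  # abs(x) == m * 2**k with 0.5 <= m < 1
    e = k - 1                  # exponent in the 1.d1d2... x 2**e form
    return 2.0 ** (1 - p + e)

FLOAT_P, DOUBLE_P = 24, 53

# 32-bit float just below 2**31: the spacing of 128 from the earlier example.
float_gap = spacing(2147483520.0, FLOAT_P)

# 64-bit double: the formula should agree with math.ulp.
sample = 1.0e6
double_gap = spacing(sample, DOUBLE_P)
eps = 2.0 ** (1 - DOUBLE_P)  # machine epsilon for double

print(float_gap)                           # 128.0
print(double_gap == math.ulp(sample))      # True
print(eps == sys.float_info.epsilon)       # True
print(double_gap <= eps * abs(sample) < 2 * double_gap)  # True
```

The last line shows in what sense the approximation is safe: $\varepsilon|x|$ is never smaller than the true spacing and is less than twice as large.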
Knowing the spacing in the neighborhood of a floating-point number can help you avoid classic numerical blunders. For example, if you’re performing an iterative calculation, such as searching for the root of an equation, there’s no sense ...