Floating-Point Computations

Computers are finite machines designed to perform basic computations on values stored in registers by a Central Processing Unit (CPU). The size of these registers has evolved as computer architectures have grown from the popular 8-bit Intel processors of the 1970s to today's widespread 64-bit architectures (such as Intel's Itanium and Sun Microsystems' SPARC processors). The CPU often supports basic operations, such as ADD, MULT, DIVIDE, and SUB, over integer values stored within these registers. Floating-point units (FPUs) can efficiently process floating-point computations according to the IEEE Standard for Binary Floating-Point Arithmetic (IEEE 754).
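
For example, a minimal Java sketch (using the standard Double.doubleToLongBits method) can expose this IEEE 754 layout of a 64-bit double: 1 sign bit, 11 exponent bits biased by 1023, and 52 fraction bits.

public class Ieee754Layout {
    public static void main(String[] args) {
        double value = 0.1;
        long bits = Double.doubleToLongBits(value);

        long sign     = (bits >>> 63) & 0x1L;        // 1 sign bit
        long exponent = (bits >>> 52) & 0x7FFL;      // 11 exponent bits (biased by 1023)
        long fraction = bits & 0xFFFFFFFFFFFFFL;     // 52 fraction (mantissa) bits

        System.out.printf("value    = %s%n", value);
        System.out.printf("sign     = %d%n", sign);
        System.out.printf("exponent = %d (unbiased %d)%n", exponent, exponent - 1023);
        System.out.printf("fraction = 0x%013X%n", fraction);
    }
}

Running this for 0.1 prints an unbiased exponent of -4 and the fraction 0x999999999999A, the repeating binary pattern behind the rounding error discussed below.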

Computations over integer-based values (such as Booleans, 8-bit shorts, and 16- and 32-bit integers) have traditionally been the most efficient computations performed by the processor. Efficient programs often take advantage of the performance differential between integer-based and floating point-based arithmetic. There are important issues that developers must be aware of when programming with floating-point arithmetic (Goldberg, 1991). Here we focus on the issues that affect the algorithms and supporting code for this book.
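
One such issue can be seen in a minimal Java sketch: floating-point addition is not associative, so reordering the same three additions can change the result, and neither compilers nor programmers can freely rearrange floating-point expressions.

public class Associativity {
    public static void main(String[] args) {
        double left  = (0.1 + 0.2) + 0.3;    // group the first two additions
        double right = 0.1 + (0.2 + 0.3);    // group the last two additions

        System.out.println(left);            // 0.6000000000000001
        System.out.println(right);           // 0.6
        System.out.println(left == right);   // false
    }
}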

Rounding Error

Any computation using floating-point values may introduce rounding errors because of the nature of the floating-point representation. In general, a floating-point number is a finite ...
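
A minimal Java sketch illustrates the effect: the literal 0.1 has no exact binary representation, and the small error in its stored value accumulates across repeated additions.

import java.math.BigDecimal;

public class RoundingErrorDemo {
    public static void main(String[] args) {
        // Printing the stored double through BigDecimal reveals the value
        // actually held for the literal 0.1.
        System.out.println(new BigDecimal(0.1));
        // 0.1000000000000000055511151231257827021181583404541015625

        // The tiny representation error accumulates: ten additions of 0.1
        // do not sum to exactly 1.0.
        double sum = 0.0;
        for (int i = 0; i < 10; i++) {
            sum += 0.1;
        }
        System.out.println(sum);             // 0.9999999999999999
        System.out.println(sum == 1.0);      // false
    }
}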
