16.1 FLOATING-POINT SYSTEM DEFINITION
Assume that a set of real numbers x belonging to the interval
is represented in such a way that the following specifications are satisfied:
- d1 is the maximum distance between small exactly-represented nonzero numbers;
- d2 is the maximum distance between large exactly-represented numbers;
- xmin is the maximum distance between 0 and the smallest exactly-represented numbers,
where the adjectives small and large refer to the absolute value of the corresponding numbers.
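To make the idea of magnitude-dependent spacing concrete, the sketch below measures the gap between neighbouring exactly-represented numbers at three magnitudes. It uses Python's built-in float, assumed here to be an IEEE 754 binary64 value, purely as a stand-in; it is not the system defined in this section. The helper name gap_above is hypothetical, and math.nextafter requires Python 3.9 or later.

```python
import math
import sys


def gap_above(x):
    """Distance from x to the next larger exactly-represented float."""
    return math.nextafter(x, math.inf) - x


# Assumption: Python floats are IEEE 754 binary64 (true on common platforms).
smallest_normal = sys.float_info.min       # smallest positive normalized float
gap_small = gap_above(smallest_normal)     # spacing among small numbers
gap_one = gap_above(1.0)                   # spacing just above 1
gap_large = gap_above(2.0 ** 52)           # spacing among large numbers

print(gap_small, gap_one, gap_large)
```

The gap grows with the absolute value of the number, which is why a specification of this kind can state separate bounds for small and for large numbers.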
Every number x will be represented in the form $\pm s \cdot b^{e}$, with $b \geq 2$, where s is the significand and e is the exponent.
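As a minimal sketch of the sign, significand, and exponent form, the function below splits a nonzero number into those three parts for a given base b. The normalization used, a significand in the range 1 ≤ s < b, is an assumption made for this illustration and may differ from the interval the text specifies. The function name decompose is hypothetical, and exact rational arithmetic is used so that no rounding enters the picture.

```python
from fractions import Fraction


def decompose(x, b=2):
    """Return (sign, s, e) such that x == sign * s * b**e.

    Assumption for this sketch: the significand is normalized to 1 <= s < b.
    """
    x = Fraction(x)
    if x == 0:
        raise ValueError("zero has no normalized significand")
    sign = -1 if x < 0 else 1
    s = abs(x)
    e = 0
    # Scale down until the significand drops below the base.
    while s >= b:
        s /= b
        e += 1
    # Scale up until the significand reaches at least 1.
    while s < 1:
        s *= b
        e -= 1
    return sign, s, e


sign, s, e = decompose(6.5, b=2)
print(sign, s, e)                       # the parts of 6.5 in base 2
print(sign * s * Fraction(2) ** e)      # reassembles the original value
```

With b = 2, the value 6.5 splits into significand 13/8 and exponent 2, since 13/8 · 2² = 6.5. The same function with b = 10 turns 250 into 5/2 · 10².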
To make the arithmetic operations easier to implement (Section 16.2), the following two conditions must be satisfied:
- The significand s is represented in base B = b.
- The significand belongs to the interval
Thus x is expressed in the form
The values of p, emin, and emax are chosen in such a way that
Example 16.1 Define a floating-point representation system ...