OpenGL Programming Guide: The Official Guide to Learning OpenGL, Version 4.3, Eighth Edition

16-bit Floating-Point Values

For signed 16-bit floating-point values, the smallest positive normalized value and the largest finite value that can be represented are approximately 6.103 × 10^-5 and 65504.0, respectively.
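Both limits follow from the half-precision bit layout (1 sign bit, 5 exponent bits with a bias of 15, 10 mantissa bits). The following is a minimal sketch of that arithmetic, not code from the book; the helper names `pow2i`, `f16_min_normal`, and `f16_max_finite` are hypothetical and introduced only for illustration.

```c
/* Assumed layout constants for the 16-bit format (illustrative names). */
#define HALF_EXPONENT_BIAS   15
#define HALF_MANTISSA_WIDTH  10
#define HALF_MAX_BIASED_EXP  30   /* 31 is reserved for Inf and NaN */

/* Exact power of two for small integer exponents, avoiding <math.h>. */
static double pow2i(int e)
{
    double r = 1.0;
    while (e > 0) { r *= 2.0; e--; }
    while (e < 0) { r *= 0.5; e++; }
    return r;
}

/* Smallest positive normalized value: biased exponent 1, mantissa 0,
 * which gives 2^(1 - 15) = 2^-14. */
double f16_min_normal(void)
{
    return pow2i(1 - HALF_EXPONENT_BIAS);
}

/* Largest finite value: biased exponent 30, mantissa all ones,
 * which gives (2 - 2^-10) * 2^15. */
double f16_max_finite(void)
{
    double significand = 2.0 - pow2i(-HALF_MANTISSA_WIDTH);
    return significand * pow2i(HALF_MAX_BIASED_EXP - HALF_EXPONENT_BIAS);
}
```

Here `f16_min_normal()` evaluates to 2^-14 (6.103515625 × 10^-5) and `f16_max_finite()` to 65504.0, matching the figures quoted above.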

The following routine, F32toF16(), converts a single-precision 32-bit floating-point value to a 16-bit reduced-precision form (stored as an unsigned short integer).

#define F16_EXPONENT_BITS 0x1F
#define F16_EXPONENT_SHIFT 10
#define F16_EXPONENT_BIAS 15
#define F16_MANTISSA_BITS 0x3ff
#define F16_MANTISSA_SHIFT (23 - F16_EXPONENT_SHIFT)
#define F16_MAX_EXPONENT \
    (F16_EXPONENT_BITS << F16_EXPONENT_SHIFT)

GLushort
F32toF16(GLfloat val)
{
    GLuint f32 = (*(GLuint *) &val);
    GLushort f16 = 0;
    /* Decode IEEE 754 little-endian 32-bit ...
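The listing above is cut off in this excerpt. As an illustration of how such a conversion can work, here is a separate minimal sketch; it is not the book's routine. It assumes `uint32_t` and `uint16_t` as stand-ins for `GLuint` and `GLushort`, uses `memcpy` rather than a pointer cast to read the float's bits, truncates the mantissa instead of rounding, flushes values below the smallest normalized half value to signed zero, and maps overflow to infinity. The function name `f32_to_f16_sketch` is hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define F16_EXPONENT_BITS  0x1F
#define F16_EXPONENT_SHIFT 10
#define F16_EXPONENT_BIAS  15
#define F16_MANTISSA_BITS  0x3ff
#define F16_MANTISSA_SHIFT (23 - F16_EXPONENT_SHIFT)
#define F16_MAX_EXPONENT   (F16_EXPONENT_BITS << F16_EXPONENT_SHIFT)

/* Hypothetical helper: simplified float-to-half conversion (see assumptions above). */
uint16_t f32_to_f16_sketch(float val)
{
    uint32_t f32;
    memcpy(&f32, &val, sizeof f32);                  /* read the raw IEEE 754 bits */

    uint16_t sign = (uint16_t)((f32 >> 16) & 0x8000u);   /* sign moves to bit 15 */
    int32_t  exp  = (int32_t)((f32 >> 23) & 0xFFu);      /* exponent, biased by 127 */
    uint32_t man  = f32 & 0x007FFFFFu;                   /* 23-bit mantissa */

    if (exp == 0xFF) {
        /* Inf keeps a zero mantissa; NaN gets a nonzero (quiet) mantissa bit. */
        uint16_t payload = man ? 0x0200u : 0u;
        return (uint16_t)(sign | F16_MAX_EXPONENT | payload);
    }

    exp = exp - 127 + F16_EXPONENT_BIAS;             /* re-bias for the 16-bit format */

    if (exp >= F16_EXPONENT_BITS)                    /* too large: signed infinity */
        return (uint16_t)(sign | F16_MAX_EXPONENT);
    if (exp <= 0)                                    /* too small: signed zero */
        return sign;

    /* Normal case: pack the sign, 5-bit exponent, and top 10 mantissa bits. */
    return (uint16_t)(sign
                      | ((uint32_t)exp << F16_EXPONENT_SHIFT)
                      | (man >> F16_MANTISSA_SHIFT));
}
```

Under these assumptions, `f32_to_f16_sketch(1.0f)` yields 0x3C00 and `f32_to_f16_sketch(65504.0f)` yields 0x7BFF, the largest finite half value. A production conversion would typically also round to nearest and produce denormalized results rather than flushing to zero.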
