String byte order

Extended character sets use more than one byte per character. When such characters are stored in a file, the order of the bytes becomes important: the writer of the characters must use the same byte order that potential readers will assume, otherwise a reader will assemble the bytes into a different character.
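To make the two possible orders concrete, here is a minimal sketch of how a writer might split one 16-bit character into bytes. The helper names `to_big_endian` and `to_little_endian` are hypothetical and chosen only for illustration.

```cpp
#include <array>

// Hypothetical helpers for illustration only.

// Most significant byte first (big-endian order).
std::array<unsigned char, 2> to_big_endian(char16_t c)
{
    return {{ static_cast<unsigned char>(c >> 8),
              static_cast<unsigned char>(c & 0xFF) }};
}

// Least significant byte first (little-endian order).
std::array<unsigned char, 2> to_little_endian(char16_t c)
{
    return {{ static_cast<unsigned char>(c & 0xFF),
              static_cast<unsigned char>(c >> 8) }};
}
```

For the euro sign, \u20AC, the first helper yields the bytes 0x20 0xAC and the second yields 0xAC 0x20. A reader that assumes the wrong order would reassemble the second pair as \uAC20, which is a different character.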

One way to do this is to use a Byte Order Mark (BOM). A BOM is a known number of bytes with a known pattern, typically placed as the first item in a stream so that the reader can use it to determine the byte order of the remaining characters in the stream. Unicode defines the 16-bit character, \uFEFF, as the byte order mark, and reserves its byte-swapped value, \uFFFE, as a non-character. A reader that finds \uFFFE at the start of a stream therefore knows that the bytes were written in the opposite order to the one it assumed. In the case of \uFEFF, all bits are set except for bit 8 (if the lowest bit is labeled as bit 0). This ...
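The BOM check described above can be sketched as follows. This is a minimal illustration, assuming a stream of 16-bit characters whose first two bytes are already in a buffer. The names `ByteOrder`, `detect_byte_order`, and `read_unit` are hypothetical and not taken from any library.

```cpp
#include <cstddef>

// Hypothetical names for illustration only.
enum class ByteOrder { BigEndian, LittleEndian, Unknown };

// Inspect the first two bytes of a buffer for the byte order mark.
// \uFEFF written big-endian appears as the bytes FE FF;
// written little-endian it appears as FF FE.
ByteOrder detect_byte_order(const unsigned char* data, std::size_t size)
{
    if (size < 2) return ByteOrder::Unknown;
    if (data[0] == 0xFE && data[1] == 0xFF) return ByteOrder::BigEndian;
    if (data[0] == 0xFF && data[1] == 0xFE) return ByteOrder::LittleEndian;
    return ByteOrder::Unknown;  // no BOM present
}

// Assemble one 16-bit character from two bytes in the given order.
// Assumption: anything other than LittleEndian is read as big-endian.
char16_t read_unit(const unsigned char* p, ByteOrder order)
{
    if (order == ByteOrder::LittleEndian)
        return static_cast<char16_t>(p[0] | (p[1] << 8));
    return static_cast<char16_t>((p[0] << 8) | p[1]);
}
```

A reader would call `detect_byte_order` once on the start of the stream, skip the two BOM bytes if one was found, and then pass the result to `read_unit` for each following pair of bytes.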
