Encoding text

Text characters can be represented in different ways. For example, the letters of the Latin alphabet can be encoded in Morse code as a series of dots and dashes for transmission over a telegraph line.

In a similar way, text inside a computer is stored as bits: ones and zeros. .NET uses a standard called Unicode to encode text internally. Sometimes, you will need to move text outside .NET for use by systems that do not use Unicode or that use a variation of Unicode. The following table shows some alternative encodings:

| Encoding | Description |
|----------|-------------|
| ASCII | Encodes a limited range of characters using the lower seven bits of a byte |
| UTF-8 | Represents each Unicode code point as a sequence of one to four bytes |
| UTF-16 | Represents each Unicode code point ... |
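To make the difference between the encodings in the table concrete, here is a minimal sketch that converts the same string to bytes with each of them and then decodes the bytes again. It uses the `Encoding` class from `System.Text`. The class name `EncodingSketch`, the helper `ByteCount`, and the sample text are illustrative assumptions, not part of the original text.

```csharp
using System;
using System.Text;

public static class EncodingSketch
{
    // Hypothetical helper: number of bytes needed to store the text
    // in the given encoding.
    public static int ByteCount(string text, Encoding encoding)
    {
        return encoding.GetBytes(text).Length;
    }

    public static void Main()
    {
        // Illustrative sample: \u00E9 is 'é', which lies outside the ASCII range.
        string text = "Caf\u00E9";

        // Encoding.Unicode is the .NET name for little-endian UTF-16.
        Encoding[] encodings = { Encoding.ASCII, Encoding.UTF8, Encoding.Unicode };

        foreach (Encoding encoding in encodings)
        {
            byte[] bytes = encoding.GetBytes(text);        // text -> bytes
            string roundTrip = encoding.GetString(bytes);  // bytes -> text

            Console.WriteLine("{0}: {1} bytes, decodes to {2}",
                encoding.WebName, bytes.Length, roundTrip);
        }
    }
}
```

For this sample, ASCII produces 4 bytes but cannot represent the accented letter, so the text decodes as `Caf?`. UTF-8 produces 5 bytes and UTF-16 produces 8 bytes, and both decode back to the original text.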
