URL Encoding

All HTTP messages, with the possible exception of the message body, use the ISO-8859-1 (ISO-Latin) character set. An HTTP request may include an Accept-Charset request header that identifies alternative character sets that are acceptable for the content in the HTTP response.

URLs pose a special challenge, because their syntax does not allow the following groups of characters from the ISO-Latin character set:

  • Non-ASCII characters: The ASCII characters are the lower half (characters 0-127) of the 256-character ISO-Latin character set. All non-ASCII characters (128-255) are invalid for use in URLs.

  • Non-printable ASCII characters (control characters): Several ASCII ...
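
To make the two groups above concrete, here is a minimal Python sketch that finds such characters in a string and replaces each one with the standard percent-encoded form (a percent sign followed by the character's two-digit hexadecimal ISO-Latin code). The function names `is_disallowed` and `percent_encode` are hypothetical, chosen only for this illustration, and the sketch assumes that control characters means codes 0-31 and 127. It covers only the two groups listed above, not every character a URL may need encoded.

```python
from urllib.parse import unquote


def is_disallowed(ch: str) -> bool:
    """Return True if ch is non-ASCII (128-255) or an ASCII control character.

    Assumption for this sketch: control characters are codes 0-31 and 127.
    """
    code = ord(ch)
    return code > 127 or code < 32 or code == 127


def percent_encode(text: str) -> str:
    """Replace each disallowed character with %XX, using its ISO-Latin code."""
    out = []
    for ch in text:
        code = ord(ch)
        if code > 255:
            # Outside the 256-character ISO-Latin set; not handled here.
            raise ValueError(f"{ch!r} is not an ISO-8859-1 character")
        out.append(f"%{code:02X}" if is_disallowed(ch) else ch)
    return "".join(out)


print(percent_encode("caf\u00e9"))  # prints caf%E9 (e-acute is code 233)
print(percent_encode("a\tb"))       # prints a%09b (tab is code 9)
print(unquote("caf%E9", encoding="latin-1"))  # decodes back to the original text
```

The standard library's `urllib.parse.quote(text, encoding="latin-1")` performs the same substitution for these characters, but it also encodes other characters such as spaces, so the hand-written loop is used here to isolate the two groups.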
