Source Code Encoding

Python source programs are normally written in standard 7-bit ASCII. However, users working in Unicode environments may find this awkward—especially if they must write a lot of string literals.

It is possible to write Python source code in a different encoding by including a special comment in the first or second line of a Python program:

#!/usr/bin/env python
# -*- coding: UTF-8 -*-

name = u'M\303\274ller'  # String in quotes is directly encoded in UTF-8.

When the special coding: comment is supplied, Unicode string literals may be specified directly in the specified encoding (using a Unicode-aware editor program for instance). However, other elements of Python, including identifier names and reserved words, are still restricted ...

Get Python: Essential Reference, Third Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.