UnicodeEncodeError and UnicodeDecodeError

UnicodeEncodeError and UnicodeDecodeError belong to the UnicodeError class (which itself is a subclass of ValueError). Unicode errors are an annoyance in Python 2.X that has largely been remedied by Python 3.X. We will encounter these errors either when encoding or decoding with different codecs. For example, if we parse some files that contain Unicode characters, we may need to encode that data to ASCII in order to process it properly with Python. Specifically, when encoding the data to ASCII, we can decide how we want to handle unsupported Unicode characters. We can perform a number of actions including ignoring and replacing unsupported characters as follows:

>>> unicode_str = u'\xe1\x93\x88my unicode ...

Get Learning Python for Forensics now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.