HTMLParser

The HTMLParser module defines a class, HTMLParser, that can be used to parse HTML and XHTML documents. To use this module, define your own class that inherits from HTMLParser and override its handler methods as appropriate.
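As a rough illustration of the subclassing pattern, the sketch below collects the href values of anchor tags. The class name LinkCollector and the sample markup are made up for this example, and handle_starttag is one of the handler methods of HTMLParser that is not described in this excerpt. The import falls back to html.parser, which is where the class lives in Python 3.

```python
try:
    from HTMLParser import HTMLParser  # Python 2 module name, as used in the text
except ImportError:
    from html.parser import HTMLParser  # the same class under its Python 3 name


class LinkCollector(HTMLParser):
    """Hypothetical subclass that gathers the href of every <a> tag."""

    def __init__(self):
        HTMLParser.__init__(self)  # the base class takes no arguments
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


parser = LinkCollector()
parser.feed('<p>See <a href="http://example.com/">this</a> and <a href="/docs">that</a>.</p>')
parser.close()
print(parser.links)  # ['http://example.com/', '/docs']
```

Only the overridden handler does any work here; every handler that is left alone keeps the base class's default behavior of doing nothing.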

HTMLParser()

This is a base class that is used to create HTML parsers. It is initialized without any arguments.

An instance, h, of HTMLParser has the following methods:

h.close()

Closes the parser and forces the processing of any remaining unparsed data. Call this method after all HTML data has been fed to the parser.

h.feed(data)

Supplies new data to the parser. This data will be immediately parsed. However, if the data is incomplete (for example, it ends with an incomplete HTML element), the incomplete portion will ...
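The handling of incomplete input can be seen by feeding a document in pieces. The following is a minimal sketch, assuming a hypothetical TagLogger subclass that records start tags through handle_starttag (a handler not covered in this excerpt). The first chunk ends in the middle of a start tag, so the parser reports nothing until the second chunk completes it.

```python
try:
    from HTMLParser import HTMLParser  # Python 2 module name, as used in the text
except ImportError:
    from html.parser import HTMLParser  # the same class under its Python 3 name


class TagLogger(HTMLParser):
    """Hypothetical helper that records each start tag it is told about."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, attrs))


h = TagLogger()

# The first chunk stops partway through a start tag, so nothing is reported yet.
h.feed('<a hr')
after_first = list(h.seen)

# The second chunk completes the tag, and the parser now reports it.
h.feed('ef="/docs">docs</a>')
after_second = list(h.seen)

h.close()  # signal that no more data is coming

print(after_first)   # []
print(after_second)  # [('a', [('href', '/docs')])]
```

This is why data read from a file or socket can be passed to feed() in whatever block sizes are convenient, with close() called once at the end.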
