The formatter Module
The formatter module provides formatter classes that can be used together with the htmllib module.
This module provides two class families, formatters and writers. Formatters convert a stream of tags and data strings from the HTML parser into an event stream suitable for an output device, and writers render that event stream on an output device. Example 5-13 demonstrates.
In most cases, you can use the AbstractFormatter class to do the formatting. It calls methods on the writer object, representing different kinds of formatting events. The AbstractWriter class simply prints a message for each method call.
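The division of labor between parser, formatter, and writer can be sketched without the original modules (htmllib was removed in Python 3.0 and formatter in Python 3.10). The following is a minimal illustration, not the library's implementation: RecordingWriter, TinyFormatter, and TinyParser are hypothetical stand-ins, and only the event names send_paragraph and send_flowing_data are borrowed from the example in this section.

```python
from html.parser import HTMLParser


class RecordingWriter:
    """Hypothetical writer: records one message per event instead of rendering."""

    def __init__(self):
        self.events = []

    def send_paragraph(self, blank_lines):
        self.events.append("send_paragraph(%r)" % (blank_lines,))

    def send_flowing_data(self, data):
        self.events.append("send_flowing_data(%r)" % (data,))


class TinyFormatter:
    """Hypothetical formatter: turns parser callbacks into writer events."""

    def __init__(self, writer):
        self.writer = writer

    def end_paragraph(self, blank_lines):
        self.writer.send_paragraph(blank_lines)

    def add_flowing_data(self, data):
        # Collapse runs of whitespace before handing text to the writer.
        text = " ".join(data.split())
        if text:
            self.writer.send_flowing_data(text)


class TinyParser(HTMLParser):
    """Hypothetical parser: forwards <p> tags and text to the formatter."""

    def __init__(self, formatter):
        super().__init__()
        self.formatter = formatter

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.formatter.end_paragraph(1)

    def handle_data(self, data):
        self.formatter.add_flowing_data(data)


writer = RecordingWriter()
parser = TinyParser(TinyFormatter(writer))
parser.feed("<p>Some text.</p><p>More   text.</p>")
parser.close()

for event in writer.events:
    print(event)
```

The parser knows only about tags and text, the formatter decides which events they produce, and the writer decides what to do with each event. Swapping the writer changes the output device without touching the other two layers.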
Example 5-13. Using the formatter Module to Convert HTML to an Event Stream
File: formatter-example-1.py

import formatter
import htmllib

w = formatter.AbstractWriter()
f = formatter.AbstractFormatter(w)

file = open("samples/sample.htm")

p = htmllib.HTMLParser(f)
p.feed(file.read())
p.close()

file.close()

send_paragraph(1)
new_font(('h1', 0, 1, 0))
send_flowing_data('A Chapter.')
send_line_break()
send_paragraph(1)
new_font(None)
send_flowing_data('Some text. Some more text. Some')
send_flowing_data(' ')
new_font((None, 1, None, None))
send_flowing_data('emphasized')
new_font(None)
send_flowing_data(' text. A')
send_flowing_data(' link')
send_flowing_data('[1]')
send_flowing_data('.')
In addition to the AbstractWriter class, the formatter module provides a NullWriter class, which ignores all events passed to it, and a DumbWriter class that converts the ...
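The idea of a writer that ignores every event can be sketched in a few lines. This is an illustration only, and it uses a different technique from a class that defines each event method explicitly: the hypothetical IgnoringWriter below resolves any method name to a no-op through __getattr__.

```python
class IgnoringWriter:
    """Hypothetical stand-in for a writer that discards every event."""

    def __getattr__(self, name):
        # Called only for attributes that are not otherwise defined,
        # so every event method resolves to the same no-op function.
        def ignore(*args, **kwargs):
            return None
        return ignore


w = IgnoringWriter()
w.send_paragraph(1)              # accepted and discarded
w.send_flowing_data("dropped")   # accepted and discarded
```

A writer like this can be passed wherever a writer is expected when only the parsing and formatting side effects matter and no output is wanted.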