HTML::FormatText

The HTML::FormatText module takes a parsed HTML file and outputs a plain-text version of it. Character attributes such as bold or italic fonts and font sizes are lost, since plain text cannot represent them.

This module is similar to HTML::FormatPS: the constructor takes formatting attributes, and the format method produces the output. A formatter object can be constructed like this:

$formatter = HTML::FormatText->new(leftmargin => 10, rightmargin => 80);

The constructor accepts two parameters, leftmargin and rightmargin, each given as a column number. The aliases lm and rm can also be used. If omitted, leftmargin defaults to 3 and rightmargin to 72.

The format method takes an HTML::TreeBuilder object and returns a scalar containing the formatted text. You can print it with:

print $formatter->format($html);
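
Putting the pieces together, the following is a minimal sketch of a complete script rather than a definitive recipe. It assumes that HTML::TreeBuilder and HTML::FormatText are installed from CPAN. The helper name html_to_text, the sample markup, and the margin values are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::FormatText;

# Hypothetical helper (not part of either module): parse an HTML
# string and return its plain-text rendering.
sub html_to_text {
    my ($source) = @_;

    # Build the parse tree that the format method expects.
    my $tree = HTML::TreeBuilder->new;
    $tree->parse($source);
    $tree->eof;

    # Margins are column numbers; these values are arbitrary.
    my $formatter = HTML::FormatText->new(leftmargin => 0, rightmargin => 50);
    my $text = $formatter->format($tree);

    # Release the tree explicitly once it is no longer needed.
    $tree->delete;

    return $text;
}

print html_to_text('<html><body><h1>Hello</h1><p>This is <b>bold</b> text.</p></body></html>');
```

The bold markup in the sample paragraph does not survive in the output; only the words themselves are kept.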
