Once you have parsed some HTML, you need to process it. Exactly what you do will depend on the nature of your problem. Two common models are extracting information and producing a transformed version of the HTML (for example, to remove banner advertisements).
Whether extracting or transforming, you'll probably want to find
the bits of the document you're interested in. They might be all
headings, all bold italic regions, or all paragraphs with
class="blinking". HTML::Element provides
several functions for searching the tree.
In scalar context, these methods return the first node that satisfies the criteria. In list context, all such nodes are returned. The methods can be called on the root of the tree or any node in it.
find_by_tag_name returns the node(s) whose tag name is one of those listed. For example,
to find all h1 and h2 nodes:
@headings = $root->find_by_tag_name('h1', 'h2');
find_by_attribute returns the node(s) with the given attribute set to the
given value. For example, to find all nodes with class="blinking":
@blinkers = $root->find_by_attribute("class", "blinking");
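The two calls above can be tried end to end with a small script. This is only a sketch: the sample HTML and the variable names are made up for illustration, and it assumes a version of HTML::TreeBuilder that provides new_from_content. It also shows the scalar-context behavior described earlier, where only the first match comes back.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample document, invented for this illustration.
my $html = <<'END_HTML';
<html><body>
<h1>Title</h1>
<p class="blinking">First notice</p>
<h2>Section</h2>
<p>Plain paragraph</p>
<p class="blinking">Second notice</p>
</body></html>
END_HTML

my $root = HTML::TreeBuilder->new_from_content($html);

# List context: every matching node is returned.
my @headings = $root->find_by_tag_name('h1', 'h2');
my @blinkers = $root->find_by_attribute('class', 'blinking');

# Scalar context: only the first matching node is returned.
my $first_blinker = $root->find_by_attribute('class', 'blinking');

print "Headings: ", join(', ', map { $_->as_text } @headings), "\n";
print "Blinking nodes: ", scalar(@blinkers), "\n";
print "First blinking node: ", $first_blinker->as_text, "\n";

# Free the tree explicitly when finished with it.
$root->delete;
```

Calling the same method in scalar and list context keeps the example short; in real code you would normally pick whichever context matches what you need.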
The look_down and look_up methods search
$node and its children (and children's
children, and so on) in the case of
look_down, or its parent (and the
parent's parent, and so on) in the case of
look_up, looking for nodes that match
whatever criteria you specify. The parameters are either
attribute => value pairs (where ...