Suppose that the output of the rewriter above is not satisfactory. Its output contains an apparently harmless one-cell, one-row table, but this causes trouble when the president of the company tries viewing that web page on his cellphone/PDA, which has a typically limited understanding of HTML. Some experimentation shows that any web page with a table in it will deeply confuse the boss's PDA.
So your task should be changed to this: find the one interesting cell in the table (the td with class="story"), detach it, replace the table with that cell, then delete the table. This is a complex series of actions, but luckily every one of them is directly translatable into an HTML::Element method. The result is Example 10-2.
Example 10-2. Detaching and reattaching nodes
use strict;
use HTML::TreeBuilder;
my $root = HTML::TreeBuilder->new;
$root->parse_file('rewriters1/in002.html') || die $!;

my $good_td = $root->look_down( '_tag', 'td',  'class', 'story', );
die "No good td?!" unless $good_td;       # sanity checking
my $big_table = $root->look_down( '_tag', 'table' );
die "No big table?!" unless $big_table;   # sanity checking

$good_td->detach;
$big_table->replace_with($good_td);
  # Yes, there's even a method for replacing one node with another!

open(OUT, ">rewriters1/out002b.html") || die "Can't write: $!";
print OUT $root->as_HTML(undef, '  ');    # two-space indent in output
close(OUT);
$root->delete;   # done with it, so delete it
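The same sequence can be tried without any files on disk. The following is only a minimal sketch: the HTML string is a made-up stand-in for in002.html (its contents are an assumption, not the real input file), and the explicit $big_table->delete call is an addition that carries out the "delete the table" step from the task description, since replace_with leaves the old table as a parentless node rather than destroying it.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical stand-in for in002.html: a one-row, one-cell table.
my $html = <<'END_HTML';
<html><head><title>News</title></head>
<body>
<table><tr><td class="story"><p>Story text here.</p></td></tr></table>
</body></html>
END_HTML

my $root = HTML::TreeBuilder->new;
$root->parse($html);
$root->eof;    # tell the parser there is no more input

my $good_td = $root->look_down('_tag', 'td', 'class', 'story')
  or die "No good td?!";
my $big_table = $root->look_down('_tag', 'table')
  or die "No big table?!";

$good_td->detach;                      # cut the cell out of the table
$big_table->replace_with($good_td);    # put the cell where the table was
$big_table->delete;                    # the old table is now parentless; discard it

my $output = $root->as_HTML(undef, '  ');    # two-space indent in output
print $output;
$root->delete;    # done with the tree, so delete it
```

Example 10-2 itself skips the explicit delete of the old table, which is harmless in a short script that exits right after writing its output file.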
The resulting document looks like ...