Untainting with Backreferences
Now that we’ve removed all the < characters from the submitted guestbook data, we’re ready to untaint that data so that Perl’s tainting mechanism won’t cause the script to die when we write the new entry out to the guestbook data file. Here, again, is the chunk of code that does that untainting:
if ($sub{$_} =~ /^([^<]*)$/) {
    $sub{$_} = $1;    # value is untainted now
}
Looking carefully at that regular expression, the /^([^<]*)$/ search pattern says “Try to do a match in which we start at the very beginning of the string, match a whole bunch of characters that are anything except <, and end up at the end of the string. And while we’re at it, let’s save whatever gets matched in $1 for later backreferencing.” Or, to put it another way, this expression says “Match the whole string, but only if the string has no < characters in it. If it has any < characters, don’t match anything.”
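As a rough illustration of that all-or-nothing behavior, here is a minimal sketch that runs the same pattern against one string without a < and one string with it. The helper name capture_if_no_lt and the sample strings are hypothetical and are not part of the guestbook script.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the captured string when the value contains
# no "<" characters, or undef when the pattern fails to match.
sub capture_if_no_lt {
    my ($value) = @_;
    if ($value =~ /^([^<]*)$/) {
        return $1;    # the whole string, saved by the parentheses
    }
    return undef;     # at least one "<" was present, so nothing matched
}

my $clean = capture_if_no_lt('Nice site!');
my $dirty = capture_if_no_lt('<b>Nice</b> site!');

print defined $clean ? "matched: $clean\n" : "no match\n";
print defined $dirty ? "matched: $dirty\n" : "no match\n";
```

The first call prints the whole string back, while the second prints "no match", because a single < anywhere in the value prevents the pattern from reaching the end of the string.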
We can be reasonably sure this expression will match because we previously used the substitution expression to replace all the < characters with &lt;. Now we just take the captured string in $1 and assign it back to $sub{$_}, and voilà, we’ve laundered that particular hash value, and Perl’s tainting mechanism no longer cares what we do with it.
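The two steps, substitution followed by laundering, can be sketched together as follows. This is only an illustration under stated assumptions: the contents of %sub are made-up sample data, and the s/</&lt;/g substitution is an assumed form of the earlier substitution expression, which is not reproduced in this section. Taint checking itself is only active when Perl runs with the -T switch; the code behaves the same without it, but there is then nothing to untaint.

```perl
use strict;
use warnings;

# Hypothetical submitted form data; the field names are illustrative only.
my %sub = (
    name    => 'Ann',
    comment => 'I <3 this <b>site</b>',
);

foreach (keys %sub) {
    # Step 1 (assumed form of the earlier substitution): replace every "<"
    # with the HTML entity "&lt;".
    $sub{$_} =~ s/</&lt;/g;

    # Step 2: match the whole string and copy the captured text back.
    if ($sub{$_} =~ /^([^<]*)$/) {
        $sub{$_} = $1;    # value is untainted now (when running under -T)
    }
}

print "$_: $sub{$_}\n" foreach sort keys %sub;
```

Because step 1 leaves no < characters behind, the match in step 2 succeeds for every field, and each value that ends up in the hash comes from $1 rather than directly from the submitted data.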
Tip
You’ll notice that Perl’s untainting mechanism doesn’t actually stop us from doing insecure things. We could always use an all-inclusive pattern like /^(.*)$/ to match a piece of tainted data, then assign whatever the old value was ...