More on HTML and URL Escapes
Perhaps the subtlest change in the last section’s rewrite is that, for robustness, this version’s reply script (Example 16-23) also calls cgi.escape for the language name, not just for the language’s code snippet. This wasn’t required in languages2.py (Example 16-20) for the known language names in our selection list table. However, someone could still pass the script a language name with an embedded HTML character as a query parameter. For example, a URL such as:
http://localhost/cgi-bin/languages2reply.py?language=a<b
embeds a < in the language name parameter (the name is a<b). When submitted, this version uses cgi.escape to properly translate the < for use in the reply HTML, according to the standard HTML escape conventions discussed earlier:
<TITLE>Languages</TITLE>
<H1>Syntax</H1><HR>
<H3>a&lt;b</H3><P><PRE>
Sorry--I don't know that language
</PRE></P><BR>
<HR>
The original version doesn’t escape the language name, so the embedded <b is interpreted as an HTML tag (which may make the rest of the page render in bold font!). As you can probably tell by now, text escapes are pervasive in CGI scripting: even text that you may think is safe must generally be escaped before being inserted into the HTML code in the reply stream.
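The difference between the two versions can be sketched in a few lines. This is only an illustration, not the code of Example 16-23: the helper names below are hypothetical, and the sketch uses html.escape, the current standard-library replacement for cgi.escape (which was removed from the cgi module in Python 3.8).

```python
import html

def heading_unescaped(language):
    # Hypothetical helper: inserts the name verbatim, as the original
    # version effectively did, so HTML-special characters pass through.
    return '<H3>%s</H3>' % language

def heading_escaped(language):
    # Hypothetical helper: escapes the name first, so characters such as
    # < are sent as entities and displayed as literal text.
    return '<H3>%s</H3>' % html.escape(language)

name = 'a<b'                       # as passed in ?language=a<b
print(heading_unescaped(name))     # <H3>a<b</H3>
print(heading_escaped(name))       # <H3>a&lt;b</H3>
```

In the first line of output a browser sees the start of a <b tag; in the second it sees only text to display.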
Because the Web is a text-based medium that combines multiple language syntaxes, multiple formatting rules may apply: one for URLs and another for HTML. We met HTML escapes earlier in this chapter; URLs, and ...
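As a rough sketch of how the two rule sets differ, the same language name can be escaped once under each convention. The calls below are from the current Python standard library (html.escape and urllib.parse), assumed here as stand-ins for the tools the book uses; the variable names are illustrative only.

```python
import html
from urllib.parse import quote_plus, unquote_plus

name = 'a<b'

# HTML escaping: for text inserted into the reply page
html_form = html.escape(name)
print(html_form)                   # a&lt;b

# URL escaping: for text inserted into a URL's query string
url_form = quote_plus(name)
print(url_form)                    # a%3Cb

url = 'http://localhost/cgi-bin/languages2reply.py?language=' + url_form
print(url)

# URL escapes are undone when the query string is decoded
print(unquote_plus(url_form))      # a<b
```

The two results are not interchangeable: an HTML entity has no meaning inside a URL, and a %XX escape has no meaning inside page text.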