Handling URLs Within a CGI Script

Credit: Jürgen Hermann

Problem

You need to build URLs within a CGI script—for example, to send an HTTP redirection header.

Solution

To build a URL within a script, you need information such as the hostname and script name. According to the CGI standard, the web server sets up a lot of useful information in the process environment of a script before it runs the script itself. In a Python script, we can access the process environment as os.environ, an attribute of the os module:

import os

def isSSL():
    """ Return true if we are on an SSL (https) connection. """
    return os.environ.get('SSL_PROTOCOL', '') != ''

def getScriptname():
    """ Return the scriptname part of the URL ("/path/to/my.cgi"). """
    return os.environ.get('SCRIPT_NAME', '')

def getPathinfo():
    """ Return the remaining part of the URL. """
    pathinfo = os.environ.get('PATH_INFO', '')

    # Fix for a well-known bug in IIS/4.0: strip the script name if it
    # shows up at the start of PATH_INFO
    if os.name == 'nt':
        scriptname = getScriptname()
        if pathinfo.startswith(scriptname):
            pathinfo = pathinfo[len(scriptname):]

    return pathinfo

def getQualifiedURL(uri=None):
    """ Return a full URL starting with scheme, servername, and port.
        Specifying uri causes it to be appended to the server root URL (uri must
        start with a slash).
    """
    if isSSL():
        scheme, stdport = 'https', '443'
    else:
        scheme, stdport = 'http', '80'
    host = os.environ.get('HTTP_HOST', '')
    if not host:
        host = os.environ.get('SERVER_NAME', 'localhost')
        # Default to the scheme's standard port so https never gets ":80"
        port = os.environ.get('SERVER_PORT', stdport)
        if port != stdport:
            host = host + ":" + port

    result = "%s://%s" % (scheme, host)
    if uri:
        result = result + uri

    return result

def getBaseURL():
    """ Return a fully qualified URL to this script. """
    return getQualifiedURL(getScriptname())

Discussion

There are many ways to manipulate URLs, but most CGI scripts share a few common needs. This recipe collects some typical high-level functions for URL synthesis from within CGI scripts. You should never hardcode hostnames or absolute paths in your scripts, because that makes it difficult to port the scripts elsewhere or to rename a virtual host. The CGI environment has sufficient information available to avoid such hardcoding, and, by importing this recipe’s code as a module, you can avoid duplicating the code that collects and uses that information in typical ways.

The recipe works by accessing information in os.environ, the attribute of Python’s standard os module that holds the environment of the current process and lets your script access it as if it were a normal Python dictionary. In particular, os.environ has a get method, just like a normal dictionary does, that returns either the value for a given key or, if that key is missing, a default value that you supply in the call to get. This recipe performs all accesses through os.environ.get, thus ensuring sensible behavior even if your web server has left the relevant environment variables undefined (this should never happen, but not all web servers are bug-free).
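As a small illustration of that lookup behavior, the sketch below reads one variable that is set and one that is not. The names DEMO_SERVER_NAME and DEMO_MISSING are made up for this example and are set or cleared by the snippet itself; a real CGI script would read names such as SERVER_NAME instead.

```python
import os

# Hypothetical variable names, used only for this demonstration.
os.environ['DEMO_SERVER_NAME'] = 'www.example.com'
os.environ.pop('DEMO_MISSING', None)   # make sure this one is absent

present = os.environ.get('DEMO_SERVER_NAME', 'localhost')
missing = os.environ.get('DEMO_MISSING', 'localhost')

print(present)   # the value stored in the environment
print(missing)   # the supplied default, because the key is absent
```

Because get never raises KeyError, a function built on it always returns a string, which is what lets the recipe's functions degrade gracefully.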

Among the functions presented in this recipe, getQualifiedURL is the one you’ll use most often. It transforms a URI into a URL on the same host (and with the same scheme) used by the CGI script that calls it. It gets the information from the environment variables HTTP_HOST, SERVER_NAME, and SERVER_PORT: HTTP_HOST is used as is when present, and otherwise the host is assembled from SERVER_NAME plus SERVER_PORT, with the port omitted when it is the standard one for the scheme. Furthermore, it can handle secure (https) as well as normal (http) connections, and it selects between the two by using the isSSL function, which is also part of this recipe.
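To see how those variables interact without a web server, here is a sketch that follows the same logic as getQualifiedURL but reads from a plain dictionary passed as an argument. The name qualified_url, its environ parameter, the choice of default port, and all the sample values are assumptions made for this illustration; they are not part of the recipe.

```python
def qualified_url(environ, uri=None):
    """Same logic as getQualifiedURL, but reading a dict (hypothetical helper)."""
    if environ.get('SSL_PROTOCOL', '') != '':
        scheme, stdport = 'https', '443'
    else:
        scheme, stdport = 'http', '80'
    host = environ.get('HTTP_HOST', '')
    if not host:
        # No Host header: fall back to the server's own name and port.
        host = environ.get('SERVER_NAME', 'localhost')
        # Assumption: a missing port defaults to the scheme's standard port.
        port = environ.get('SERVER_PORT', stdport)
        if port != stdport:
            host = host + ':' + port
    result = '%s://%s' % (scheme, host)
    if uri:
        result = result + uri
    return result

# Made-up sample environments.
with_host = {'HTTP_HOST': 'www.example.com:8080'}
no_host = {'SERVER_NAME': 'intranet.example.com', 'SERVER_PORT': '8000'}
secure_env = {'SERVER_NAME': 'shop.example.com', 'SERVER_PORT': '443',
              'SSL_PROTOCOL': 'TLSv1.2'}

print(qualified_url(with_host, '/go/here'))  # http://www.example.com:8080/go/here
print(qualified_url(no_host, '/go/here'))    # http://intranet.example.com:8000/go/here
print(qualified_url(secure_env))             # https://shop.example.com
```

Passing the environment in as an argument is only a convenience for experimenting; inside a real CGI script, reading os.environ directly, as the recipe does, is simpler.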

Suppose you need to redirect a visiting browser to another location on this same host. Here’s how you can use the functions in this recipe, hardcoding only the redirect location on the host itself, but not the hostname, the port, or the normal or secure scheme:

# an example redirect header, followed by the blank line that ends the headers:
print("Location: " + getQualifiedURL("/go/here"))
print("")
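A Location header by itself is not a complete CGI response, since the header block must end with a blank line. The sketch below assembles a minimal redirect response as a string. The function name redirect_response and the example URL are assumptions for illustration; in a script, the URL would normally come from getQualifiedURL.

```python
import sys

def redirect_response(url):
    """Return the text of a minimal CGI redirect response (hypothetical helper)."""
    lines = [
        'Status: 302 Found',   # optional: servers typically default to 302 here
        'Location: ' + url,
        '',                    # blank line terminates the header block
    ]
    return '\n'.join(lines) + '\n'

response = redirect_response('http://www.example.com/go/here')
sys.stdout.write(response)
```

Building the response as a string before writing it keeps the header logic easy to check in isolation.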

See Also

Documentation of the standard library module os in the Library Reference; a basic introduction to the CGI protocol is available at http://hoohoo.ncsa.uiuc.edu/cgi/overview.html.
