Using Google App Engine by Charles Severance

Looking at All the Data Available on an HTTP Request

There is a lot of data available to our program in addition to the POST data. The App Engine environment makes this data available to our application for each incoming request so that we can do things differently based on this information.

The environment variables fall into three categories:

  • Variables describing the server environment (SERVER_SOFTWARE or SERVER_NAME)

  • Variables describing the request data (REQUEST_METHOD, QUERY_STRING, or CONTENT_TYPE)

  • Variables describing the browser environment (HTTP_USER_AGENT, HTTP_ACCEPT, and so on)

You can find documentation about these parameters at http://hoohoo.ncsa.uiuc.edu/cgi/in.html. This is a very old website that describes the Common Gateway Interface (CGI), which was the way that the very first web servers passed input data from an HTTP request into application code running on the server.
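
As a minimal sketch of doing things differently based on this information, the function below builds a short description of a request from two of the variables named above. The function name describe_request and the wording of its messages are hypothetical, and the sketch uses Python 3 syntax rather than the Python 2 runtime used in this chapter. It takes a plain dictionary so that it can be tried without a web server:

```python
def describe_request(environ):
    """Return a short description of a request from CGI-style variables."""
    # .get() supplies a default when the server did not set the variable.
    method = environ.get('REQUEST_METHOD', 'GET')
    agent = environ.get('HTTP_USER_AGENT', '')

    if method == 'POST':
        action = 'form submission'
    else:
        action = 'page view'

    # A crude check, for illustration only; real user-agent strings vary widely.
    browser = 'Firefox' if 'Firefox' in agent else 'another client'
    return '%s from %s' % (action, browser)


sample = {
    'REQUEST_METHOD': 'POST',
    'HTTP_USER_AGENT': 'Mozilla/5.0 Gecko/2008120121 Firefox/3.0.5',
}
print(describe_request(sample))
```

In a real CGI program, the dictionary passed in would be os.environ instead of the hand-built sample.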

We will write an application that reads and dumps out all the information available to our application. We call this the “dumper” program because it just looks at its input and dumps it out.

The dumper program consists of a very simple app.yaml file and a single index.py Python file that contains the complete code of our App Engine program.

The app.yaml file names our application and routes all incoming requests to the index.py script as before:

application: ae-02-dumper
version: 1
runtime: python
api_version: 1

handlers:
- url: /.*
  script: index.py

The logic for our dumper program is completely contained in the index.py file:

import os
import sys

print 'Content-Type: text/html'
print ''
print '<form method="post" action="/" >'
print 'Zap Data: <input type="text" name="zap"><br/>'
print 'Zot Data: <input type="text" name="zot"><br/>'
print '<input type="submit">'
print '</form>'

print '<pre>'
print 'Environment keys:'
print ''
for param in os.environ.keys():
    print param, ':', os.environ[param]
print ''

print 'Data'
count = 0
for line in sys.stdin:
    count = count + 1
    print line
    # Stop once 100 lines have been printed
    if count >= 100:
        break

print '</pre>'

Let’s walk through each of the parts of the dumper program’s index.py.

The first two print statements send the HTTP response header, followed by a blank line that marks the end of the headers and the start of the HTML document:

print 'Content-Type: text/html'
print ''

When you select View Source on an HTML page, you are not shown the response headers because they are not part of the HTML document. However, with the Firebug plug-in, you can see the HTTP response headers under the Net tab, as shown in Figure 4-12.

The next set of print statements produces the HTML for a form that we use to send some more complex POST data to our program. This form now has two text input fields named zap and zot, so we can see what happens with multiple inputs:

print '<form method="post" action="/" >'
print 'Zap Data: <input type="text" name="zap"><br/>'
print 'Zot Data: <input type="text" name="zot"><br/>'
print '<input type="submit">'
print '</form>'

Figure 4-12. HTTP headers on a response

The form is quite basic, with two text fields and a Submit button. The next lines of the program read a set of variables that the server passes to our program through os.environ, which behaves like a Python dictionary. These are the environment or CGI variables. They combine the server configuration with information about the particular request itself.

We iterate through the keys in the dictionary and then print the keys and values, separated by a colon character, using a Python for loop:

print '<pre>'
print 'Environment keys:'
print ''
for param in os.environ.keys():
    print param, ':', os.environ[param]
print ''

The output from this section is as follows:

HTTP_REFERER : http://www.appenginelearn.com/
SERVER_SOFTWARE : Development/1.0
SCRIPT_NAME :
REQUEST_METHOD : GET
HTTP_KEEP_ALIVE : 300
SERVER_PROTOCOL : HTTP/1.0
QUERY_STRING :
CONTENT_LENGTH :
HTTP_ACCEPT_CHARSET : ISO-8859-1,utf-8;q=0.7,*;q=0.7
HTTP_USER_AGENT : Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.0.5) Gecko/2008120121 Firefox/3.0.5
HTTP_CONNECTION : keep-alive
SERVER_NAME : localhost
REMOTE_ADDR : 127.0.0.1
PATH_TRANSLATED : /Users/csev/Desktop/teach/appengine/apps/ae-02-dumper/index.py
SERVER_PORT : 8080
AUTH_DOMAIN : gmail.com
CURRENT_VERSION_ID : 1.1
HTTP_HOST : localhost:8080
TZ : UTC
USER_EMAIL :
HTTP_ACCEPT : text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
APPLICATION_ID : ae-02-dumper
GATEWAY_INTERFACE : CGI/1.1
HTTP_ACCEPT_LANGUAGE : en-us,en;q=0.5
CONTENT_TYPE : application/x-www-form-urlencoded
HTTP_ACCEPT_ENCODING : gzip,deflate
PATH_INFO : /

Data

Because this is an HTTP GET request, there is no data to print.

You can consult the CGI documentation for the details on each of the previously mentioned variables: http://hoohoo.ncsa.uiuc.edu/cgi/in.html.
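
Many of the variables in the output begin with HTTP_. Under the CGI convention, these carry the request headers sent by the browser, with the header name uppercased and its hyphens changed to underscores. The sketch below reverses that mapping to recover a dictionary of request headers. The function name request_headers is hypothetical, the capitalization it produces is cosmetic (header names are case-insensitive), and the code uses Python 3 syntax rather than the Python 2 runtime of this chapter:

```python
def request_headers(environ):
    """Rebuild HTTP request header names from CGI-style variables."""
    headers = {}
    for key, value in environ.items():
        if key.startswith('HTTP_'):
            # For example, HTTP_ACCEPT_LANGUAGE becomes Accept-Language.
            words = key[len('HTTP_'):].split('_')
            name = '-'.join(word.capitalize() for word in words)
            headers[name] = value
    # These two request headers are passed without the HTTP_ prefix.
    for key, name in (('CONTENT_TYPE', 'Content-Type'),
                      ('CONTENT_LENGTH', 'Content-Length')):
        if environ.get(key):
            headers[name] = environ[key]
    return headers


sample = {
    'HTTP_ACCEPT_LANGUAGE': 'en-us,en;q=0.5',
    'HTTP_HOST': 'localhost:8080',
    'SERVER_NAME': 'localhost',
    'CONTENT_TYPE': 'application/x-www-form-urlencoded',
    'CONTENT_LENGTH': '',
}
for name, value in sorted(request_headers(sample).items()):
    print(name, ':', value)
```

Variables such as SERVER_NAME are left out because they describe the server rather than a header sent by the browser.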

When we are programming at the CGI level, we are using the old mystical ways of the early web server programs. We won’t use this pattern for much longer, but it is good to start by understanding the low-level details and then delegate the handling of those details to a web framework.

The last part of the index.py program dumps out up to the first 100 lines of POST data, if the data exists:

print 'Data'
count = 0
for line in sys.stdin:
    count = count + 1
    print line
    # Stop once 100 lines have been printed
    if count >= 100:
        break

According to CGI rules, the POST data is presented to the application via its standard input. In Python, we can read through the predefined file handle sys.stdin to access our POST data using a Python for loop.
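
The dumper reads its input line by line until the input ends. A common alternative under CGI is to read only as many bytes as the CONTENT_LENGTH variable announces. The sketch below shows that approach under stated assumptions: the function name read_post_body is hypothetical, an in-memory stream stands in for standard input, and the code uses Python 3 syntax rather than the Python 2 runtime of this chapter:

```python
import io


def read_post_body(environ, stream):
    """Read up to CONTENT_LENGTH bytes of request body from stream."""
    try:
        length = int(environ.get('CONTENT_LENGTH') or 0)
    except ValueError:
        # A malformed length is treated as "no body".
        length = 0
    if length <= 0:
        return b''
    return stream.read(length)


# Simulate a POST request with an in-memory byte stream.
body = b'zap=Some+Data&zot=Some+More+Data'
fake_environ = {'REQUEST_METHOD': 'POST', 'CONTENT_LENGTH': str(len(body))}
print(read_post_body(fake_environ, io.BytesIO(body)))
```

In a Python 3 CGI program, the stream argument would be sys.stdin.buffer, which yields bytes rather than text.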

If you look at the bottom of the initial output of the program, you will see that there is no POST data because when you navigate to http://localhost:8080, the browser issues an HTTP GET request for the initial document (/).

To test the POST data dumping code, we must enter some data into the Zap and Zot input fields and click the Submit button, as shown in Figure 4-13.

Figure 4-13. Entering form data

When we click Submit, our browser sends a POST request. We can see this immediately because the REQUEST_METHOD variable changes from GET to POST:

Environment keys:

HTTP_REFERER : http://www.appenginelearn.com/
SERVER_SOFTWARE : Development/1.0
SCRIPT_NAME :
REQUEST_METHOD : POST
HTTP_KEEP_ALIVE : 300
SERVER_PROTOCOL : HTTP/1.0
QUERY_STRING :
  ...

And if we scroll down to the bottom of the output, we can see the actual POST data:

GATEWAY_INTERFACE : CGI/1.1
HTTP_ACCEPT_LANGUAGE : en-us,en;q=0.5
CONTENT_TYPE : application/x-www-form-urlencoded
HTTP_ACCEPT_ENCODING : gzip,deflate
PATH_INFO : /

Data
zap=Some+Data&zot=Some+More+Data

To make parsing easier, the POST data is encoded by escaping spaces and special characters: each space becomes a plus sign (+), and other special characters are written as a percent sign followed by two hexadecimal digits. Each parameter is written as a name and a value joined by an equals sign (=), and parameters are separated from one another by an ampersand (&). To make sense of this input, we would have to split the input data using string parsing and then unescape each piece to get back to the actual data that was typed into the form.
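
The parsing and unescaping steps can be sketched as follows. The function name parse_form_by_hand is hypothetical. The code uses Python 3, where unquote_plus and parse_qs live in urllib.parse; in the Python 2 runtime used in this chapter, the equivalent functions live in different modules:

```python
from urllib.parse import parse_qs, unquote_plus


def parse_form_by_hand(data):
    """Split urlencoded form data into a list of (name, value) pairs."""
    pairs = []
    for chunk in data.split('&'):
        if not chunk:
            continue
        # partition() tolerates a parameter with no '=' at all.
        name, _, value = chunk.partition('=')
        # unquote_plus() turns '+' into a space and %XX into the character.
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


post_data = 'zap=Some+Data&zot=Some+More+Data'
print(parse_form_by_hand(post_data))

# The standard library does the same job; each name maps to a list of values.
print(parse_qs(post_data))
```

The hand-written version is only there to make the steps visible. In practice a library routine or a web framework handles this parsing.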

There are two ways to encode POST data. The easy way is called application/x-www-form-urlencoded; this approach concatenates all the data into a single line of input, as shown earlier. The more complex way is called multipart/form-data and is described in the next section.
