Google App Engine is a web application hosting service. By “web application,” we mean an application or service accessed over the Web, usually with a web browser: storefronts with shopping carts, social networking sites, multiplayer games, mobile applications, survey applications, project management, collaboration, publishing, and all of the other things we’re discovering are good uses for the Web. App Engine can serve traditional website content too, such as documents and images, but the environment is especially designed for real-time dynamic applications.
In particular, Google App Engine is designed to host applications with many simultaneous users. When an application can serve many simultaneous users without degrading performance, we say it scales. Applications written for App Engine scale automatically. As more people use the application, App Engine allocates more resources for the application and manages the use of those resources. The application itself does not need to know anything about the resources it is using.
Unlike traditional web hosting or self-managed servers, with Google App Engine, you only pay for the resources you use. These resources are measured down to the gigabyte, with no monthly fees or up-front charges. Billed resources include CPU usage, storage per month, incoming and outgoing bandwidth, and several resources specific to App Engine services. To help you get started, every developer gets a certain amount of resources for free, enough for small applications with low traffic. Google estimates that with the free resources, an app can accommodate about 5 million page views a month.
App Engine can be described as three parts: the runtime environment, the datastore, and the scalable services. In this chapter, we’ll look at each of these parts at a high level. We’ll also discuss features of App Engine for deploying and managing web applications, and for building websites integrated with other Google offerings such as Google Apps and Google Accounts.
An App Engine application responds to web requests. A web request
begins when a client, typically a user’s web browser, contacts the
application with an HTTP request, such as to fetch a web page at a URL.
When App Engine receives the request, it identifies the application from
the domain name of the address, either an
subdomain (provided for free with every app) or a subdomain of a custom
domain name you have registered and set up with Google Apps. App Engine
selects a server from many possible servers to handle the request, making
its selection based on which server is most likely to provide a fast
response. It then calls the application with the content of the HTTP
request, receives the response data from the application, and returns the
response to the client.
From the application’s perspective, the runtime environment springs into existence when the request handler begins, and disappears when it ends. App Engine provides at least two methods for storing data that persists between requests (discussed later), but these mechanisms live outside of the runtime environment. By not retaining state in the runtime environment between requests—or at least, by not expecting that state will be retained between requests—App Engine can distribute traffic among as many servers as it needs to give every request the same treatment, regardless of how much traffic it is handling at one time.
Application code cannot access the server on which it is running in the traditional sense. An application can read its own files from the filesystem, but it cannot write to files, and it cannot read files that belong to other applications. An application can see environment variables set by App Engine, but manipulations of these variables do not necessarily persist between requests. An application cannot access the networking facilities of the server hardware, though it can perform networking operations using services.
In short, each request lives in its own “sandbox.” This allows App Engine to handle a request with the server that would, in its estimation, provide the fastest response. There is no way to guarantee that the same server hardware will handle two requests, even if the requests come from the same client and arrive relatively quickly.
Sandboxing also allows App Engine to run multiple applications on the same server without the behavior of one application affecting another. In addition to limiting access to the operating system, the runtime environment also limits the amount of clock time, CPU use, and memory a single request can take. App Engine keeps these limits flexible, and applies stricter limits to applications that use up more resources to protect shared resources from “runaway” applications.
A request has up to 30 seconds to return a response to the client. While that may seem like a comfortably large amount for a web app, App Engine is optimized for applications that respond in less than a second. Also, if an application uses many CPU cycles, App Engine may slow it down so the app isn’t hogging the processor on a machine serving multiple apps. A CPU-intensive request handler may take more clock time to complete than it would if it had exclusive use of the processor, and clock time may vary as App Engine detects patterns in CPU usage and allocates accordingly.
Google App Engine provides two possible runtime environments for applications: a Java environment and a Python environment. The environment you choose depends on the language and related technologies you want to use for developing the application.
The Python environment runs apps written in the Python 2.5 programming language, using a custom version of CPython, the official Python interpreter. App Engine invokes a Python app using CGI, a widely supported application interface standard. An application can use most of Python’s large and excellent standard library, as well as rich APIs and libraries for accessing services and modeling data. Many open source Python web application frameworks work with App Engine, such as Django, web2py, and Pylons, and App Engine even includes a simple framework of its own.
The Java and Python environments use the same application server model: a request is routed to an app server, the application is started on the app server (if necessary) and invoked to handle the request to produce a response, and the response is returned to the client. Each environment runs its interpreter (the JVM or the Python interpreter) with sandbox restrictions, such that any attempt to use a feature of the language or a library that would require access outside of the sandbox fails with an exception.
While using a different server for every request has advantages for scaling, it’s time-consuming to start up a new instance of the application for every request. App Engine mitigates startup costs by keeping the application in memory on an application server as long as possible and reusing servers intelligently. When a server needs to reclaim resources, it purges the least recently used app. All app servers have the runtime environment (JVM or Python interpreter) preloaded before the request reaches the server, so only the app itself needs to be loaded on a fresh server.
Applications can exploit the app caching behavior to cache data directly on the app server using global (static) variables. Since an app can be evicted between any two requests (and low-traffic apps are evicted frequently), and there is no guarantee that a given user’s requests will be handled by a given server, global variables are mostly useful for caching startup resources, like parsed configuration files.
I haven’t said anything about which operating system or hardware configuration App Engine uses. There are ways to figure out what operating system or hardware a server is using, but in the end it doesn’t matter: the runtime environment is an abstraction above the operating system that allows App Engine to manage resource allocation, computation, request handling, scaling, and load distribution without the application’s involvement. Features that typically require knowledge of the operating system are either provided by services outside of the runtime environment, provided or emulated using standard library calls, or restricted in logical ways within the definition of the sandbox.