A JSF-based application runs as a web application in a Java web container. While the Java Servlet API dependencies are limited to a few classes, and a simple JSF application never needs to be exposed to it, the main concepts defined by the Servlet specification must be understood to develop and deploy a JSF application. In addition, JavaServer Pages (JSP) technology—which is based on the Servlet API—is often used as the presentation layer in a JSF application.
This chapter is a brief introduction to the Hypertext Transfer Protocol (HTTP), servlets, and JSP, focusing on the areas that are important for a JSF application. This chapter contains the bare minimum you need to know to understand the rest of this book. I recommend that you read books dedicated to the subjects of servlets and JSP before you embark on a real development project. Two books I can recommend are Jason Hunter and William Crawford's Java Servlet Programming (O'Reilly) and my own JavaServer Pages (O'Reilly). If you're already familiar with these technologies, you can safely skip this chapter.
The Hypertext Transfer Protocol (HTTP) is the lingua franca of the web. In order to develop any type of web application, you must understand at least the basics of this protocol. Before we dig into servlets and JSP, let's see what HTTP is all about.
HTTP is based on a very simple communications model. Here's how it works: a client, typically a web browser, sends a request for a resource to a server, and the server sends back a response corresponding to the resource (or a response with an error message if it can't process the request for some reason). A resource can be a number of things, such as a simple HTML file returned verbatim to the browser or a program that generates the response dynamically. The request/response model is illustrated in Figure 4-1.
This simple model implies three important facts you must be aware of:
HTTP is a stateless protocol; the server doesn't keep any information about the client after it sends its response, and therefore can't recognize that multiple requests from the same client may be related.
Web applications can't easily provide the kind of immediate feedback typically found in standalone GUI applications, such as word processors or traditional client/server applications. Every interaction between the client and the server requires a request/response exchange. Performing a request/response exchange when a user selects an item in a list box or fills out a form element is usually too taxing on the bandwidth available to most Internet users.
There's nothing in the protocol that tells the server how a request is made; consequently, the server can't distinguish between various methods of triggering the request on the client. For example, the server can't differentiate between an explicit request caused by clicking a link or submitting a form and an implicit request caused by resizing the browser window or using the browser's Back button. In addition, HTTP doesn't provide any means for the server to invoke client-specific functions, such as going back in the browser history list or sending the response to a certain frame. Nor can the server detect when the user closes the browser.
Let's take a closer look at requests. A user sends a request to the server by clicking a link on a web page, submitting a form, or typing a web page address in the browser's address field. To send a request, the browser must know which server to talk to and which resource to ask for. This information is specified as a uniform resource locator (URL):
http://www.gefionsoftware.com/index.html
The first part of the URL shown here specifies that the request be made using the HTTP protocol. This is followed by the name of the server, in this case www.gefionsoftware.com. The web server waits for requests to come in on a specific TCP/IP port. Port number 80 is the standard port for HTTP requests. If the web server uses another port, the URL must specify the port number in addition to the server name. For example:
http://www.gefionsoftware.com:8080/index.html
This request is sent to a server that uses port 8080 instead of 80. The last part of the URL, /index.html, identifies the resource that the client is requesting.
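The parts of a URL described above can be pulled apart programmatically. The following is a minimal sketch using the standard java.net.URI class; the class name UrlParts and the helper effectivePort are hypothetical names chosen for this illustration, the URL is assembled from the server name, port, and path mentioned in the text, and the fallback to port 80 assumes an http URL.

```java
import java.net.URI;

public class UrlParts {
    /** Returns the port named in the URL, or 80 (the standard HTTP port) if none is given. */
    static int effectivePort(URI uri) {
        // URI.getPort() returns -1 when the URL does not specify a port.
        return uri.getPort() == -1 ? 80 : uri.getPort();
    }

    public static void main(String[] args) {
        URI uri = URI.create("http://www.gefionsoftware.com:8080/index.html");
        System.out.println("protocol: " + uri.getScheme());
        System.out.println("server:   " + uri.getHost());
        System.out.println("port:     " + effectivePort(uri));
        System.out.println("resource: " + uri.getPath());
    }
}
```

Only the last part, the path, ends up in the request line; the server name and port tell the browser where to connect.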
A URL is actually a specialization of a uniform resource identifier (URI, defined in the RFC 2396 specification). A URL identifies a resource partly by its location, for instance, the server that contains the resource. Another type of URI is a uniform resource name (URN), a globally unique identifier that is valid no matter where the resource is located. HTTP deals only with the URL variety. The terms URI and URL are often used interchangeably, but unfortunately they have slightly different definitions in different specifications. I'm trying to use the terms as defined by the HTTP/1.1 specification (RFC 2616), which is pretty close to how they are mostly used in the servlet, JSP, and JSF specifications. Hence, I use the term URL only when the URI must start with http (or https, for HTTP over an encrypted connection) followed by a server name and possibly a port number, as in the previous examples. I use URI as a generic term for any string that identifies a resource, where the location can be deduced from the context and isn't necessarily part of the URI. For example, when the request has been delivered to the server, the location is a given, and only the resource identifier is important.
The browser uses the URL information to create the request message and send it to the specified server using the specified protocol. An HTTP request message consists of three things: a request line, some request headers, and possibly a request body.
The request line starts with the request method name, followed by a resource identifier and the protocol version used by the browser:
GET /index.html HTTP/1.1
The most commonly used request method is named GET. As the name implies, a GET request is used to retrieve a resource from the server. If you type a URL in the browser's address field, or click on a link, the request is sent as a GET request to the server.
Here's an example of a valid HTTP request message:
GET /index.html HTTP/1.1
Host: www.gefionsoftware.com
User-Agent: Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv: 1.0.2)
Accept: image/gif, image/jpeg, image/pjpeg, image/png, */*
Accept-Language: en
Accept-Charset: iso-8859-1,*,utf-8
The request line specifies the GET method and asks for the resource named /index.html to be returned using the HTTP/1.1 protocol version. The various headers provide additional information. The Host header tells the server the hostname used in the URL. A server may have multiple names, so this information is used to distinguish between multiple virtual web servers sharing the same web server process. The User-Agent header contains information about the type of browser making the request. The server can use this to send different types of responses to different types of browsers. For instance, if the server knows whether Internet Explorer or Netscape Navigator is used, it can send a response that takes advantage of each browser's unique features. It can also tell if a client other than an HTML browser is being used, such as a Wireless Markup Language (WML) browser on a cell phone or a PDA device, and generate an appropriate response. The Accept headers provide information about the languages and file formats the browser accepts. These headers can be used to adjust the response to the capabilities of the browser and the user's preferences, such as using a supported image format and the user's preferred language. These are just a few of the headers that can be included in a request message. The HTTP specification, available at http://www.w3c.org/, describes all of them.
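To make the layout of a request message concrete, here is a minimal sketch that assembles one as a string: a request line, one line per header, and a blank line marking the end of the headers. The class name RequestMessage and the method build are hypothetical names used only for this illustration, and the sketch covers only requests without a body.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RequestMessage {
    /** Assembles a bodiless HTTP/1.1 request: request line, headers, blank line. */
    static String build(String method, String uri, Map<String, String> headers) {
        StringBuilder sb = new StringBuilder();
        // Request line: method name, resource identifier, protocol version.
        sb.append(method).append(' ').append(uri).append(" HTTP/1.1\r\n");
        // Each header is written as "Name: value" on its own line.
        for (Map.Entry<String, String> h : headers.entrySet()) {
            sb.append(h.getKey()).append(": ").append(h.getValue()).append("\r\n");
        }
        // An empty line ends the header section.
        sb.append("\r\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Host", "www.gefionsoftware.com");
        headers.put("Accept-Language", "en");
        System.out.print(build("GET", "/index.html", headers));
    }
}
```

A browser does this work for you; the sketch is only meant to show that a request is plain text with a fixed line structure.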
The URI doesn't necessarily correspond to a static file on the server. It can identify an executable program, a record in a database, or pretty much anything the web server knows about. That's why the generic term resource is used. In fact, there's no way to tell if the /index.html URI corresponds to a file or something else; it's just a name that means something to the server. The web server is configured to map these unique names to the real resources.
When the web server receives the request, it looks at the URI and decides how to handle it based on configuration information. It may handle the request internally by simply reading an HTML file from the filesystem, or it may forward the request to some component that is responsible for the resource corresponding to the URI. This can be a program that uses database information, for instance, to dynamically generate an appropriate response. To the browser it makes no difference how the request is handled; all it cares about is getting a response.
The response message looks similar to the request message. It consists of three things: a status line, some response headers, and an optional response body. Here's an example:
HTTP/1.1 200 OK
Last-Modified: Mon, 20 Dec 2002 23:26:42 GMT
Date: Mon, 16 Jun 2003 20:52:40 GMT
Status: 200
Content-Type: text/html
Servlet-Engine: Tomcat Web Server/5.0
Content-Length: 59

<html>
  <body>
    <h1>Hello World!</h1>
  </body>
</html>
The status line starts with the name of the protocol, followed by a status code and a short description of the status code. Here the status code is 200, meaning the request was executed successfully. The response message has headers just like the request message. In this example, the Last-Modified header gives the date and time when the resource was last modified. The browser can use this information as a timestamp in a local cache; the next time the user asks for this resource, the browser can ask the server to send it only if it's been updated since the last time it was requested. The Content-Type header tells the browser what type of response data the body contains, and the Content-Length header how large it is. The other headers are self-explanatory. A blank line separates the headers from the message body. Here the body is a simple HTML page:
<html>
  <body>
    <h1>Hello World!</h1>
  </body>
</html>
Of course, the body can contain a more complex HTML page or any other type of content. For example, the request may return an HTML page with <img> elements. When the browser reads the first response and finds the <img> elements, it sends a new request for the resource identified by each element, often in parallel. The server returns one response for each image request, with a Content-Type header telling what type of image it is (for instance, image/gif) and the body containing the bytes that make up the image. The browser then combines all responses to render the complete page. This interaction is illustrated in the accompanying figure.
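The structure of a response message (status line, headers, blank line, body) can be sketched as a small parser. This is a simplified illustration, not how a browser or the servlet API actually does it: the class name ResponseMessage is hypothetical, the sketch assumes the whole response is available as one string with CRLF line endings, and it ignores details such as case-insensitive header names and repeated headers.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ResponseMessage {
    final int statusCode;
    final Map<String, String> headers = new LinkedHashMap<>();
    final String body;

    ResponseMessage(String raw) {
        // A blank line separates the status line and headers from the body.
        int split = raw.indexOf("\r\n\r\n");
        String head = split >= 0 ? raw.substring(0, split) : raw;
        body = split >= 0 ? raw.substring(split + 4) : "";
        String[] lines = head.split("\r\n");
        // Status line: protocol name, status code, short description.
        statusCode = Integer.parseInt(lines[0].split(" ", 3)[1]);
        // Remaining lines are "Name: value" headers; split at the first colon.
        for (int i = 1; i < lines.length; i++) {
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                headers.put(lines[i].substring(0, colon).trim(),
                            lines[i].substring(colon + 1).trim());
            }
        }
    }

    public static void main(String[] args) {
        ResponseMessage r = new ResponseMessage(
            "HTTP/1.1 200 OK\r\n"
            + "Content-Type: text/html\r\n"
            + "Content-Length: 21\r\n"
            + "\r\n"
            + "<h1>Hello World!</h1>");
        System.out.println(r.statusCode + " " + r.headers.get("Content-Type"));
        System.out.println(r.body);
    }
}
```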
Besides the URI and headers, a request message can contain additional information in the form of parameters. If the URI identifies a server-side program for displaying weather information, for example, request parameters can provide information about which city the user wants to see a forecast for. In an e-commerce application, the URI may identify a program that processes orders, using the customer number and the list of items to be purchased as parameters.
Parameters can be appended to the URI as a query string or sent in the request message body. The query string starts with a question mark (?) and consists of name/value pairs separated by ampersands (&). These names and values must be URL-encoded, meaning that special characters, such as whitespace, question marks, ampersands, and all other nonalphanumeric characters, are encoded so that they don't get confused with characters used to separate name/value pairs and other parts of the URI. In a query string such as city=Hermosa+Beach&state=CA, the space between Hermosa and Beach is encoded as a plus sign. Other special characters are encoded as a percent sign followed by their corresponding hexadecimal ASCII value; for instance, a question mark is encoded as %3F. When parameters are sent as part of the request body, they follow the same syntax: URL-encoded name/value pairs separated by ampersands.
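The encoding rules can be tried out with the standard java.net.URLEncoder class. The sketch below assumes Java 10 or later (for the encode overload that takes a Charset) and UTF-8 as the character encoding; the class name QueryString and the method encode are hypothetical names for this illustration.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class QueryString {
    /** Joins URL-encoded name/value pairs with ampersands. */
    static String encode(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> p : params.entrySet()) {
            // Both the name and the value are encoded; "=" and "&" stay literal
            // because they are added after encoding.
            joiner.add(URLEncoder.encode(p.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(p.getValue(), StandardCharsets.UTF_8));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("city", "Hermosa Beach");
        params.put("state", "CA");
        // Prints: city=Hermosa+Beach&state=CA
        System.out.println(encode(params));
    }
}
```

The same string can be appended to a URI after a question mark or sent as the body of a request.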
GET is the most commonly used request method, intended to retrieve a resource without causing anything else to happen on the server. The POST method is almost as common as GET. A POST request asks for some kind of processing on the server; for instance, updating a database or processing a purchase order.
The way parameters are transferred is one of the most obvious differences between the GET and POST request methods. A GET request always uses a query string to send parameter values, while a POST request sends them as part of the body (it can also send some parameters as a query string, just to make life interesting). If you insert a link in an HTML page using an <a> element, clicking on the link results in a GET request being sent to the server. The GET request uses a query string to pass parameters, so you can include hardcoded parameter values in the link URI:
<a href="/forecast?city=Hermosa+Beach&state=CA"> Hermosa Beach weather forecast </a>
When you use a form to send user input to the server, you can specify whether to use the GET or POST method with the method attribute, as shown here:
<form action="/forecast" method="POST">
  City: <input name="city" type="text">
  State: <input name="state" type="text">
  <p>
  <input type="SUBMIT">
</form>
If the user enters "Hermosa Beach" and "CA" in the form fields and clicks on the Submit button, the browser sends a request message like this to the server:
POST /forecast HTTP/1.1
Host: www.gefionsoftware.com
User-Agent: Mozilla/5.0 (Windows; U; Win 9x 4.90; en-US; rv: 1.0.2)
Accept: image/gif, image/jpeg, image/pjpeg, image/png, */*
Accept-language: en-US
Accept-charset: iso-8859-1,*,utf-8

city=Hermosa+Beach&state=CA
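A comparable POST request can be put together from Java code. The following is a minimal sketch using the java.net.http API available in Java 11 and later; it only builds the request object and does not send anything. The class name ForecastPost is hypothetical, the URL combines the Host header and path from the message above, and the Content-Type header is an addition not shown in that message: it is the value browsers normally use for HTML form data.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class ForecastPost {
    /** Builds (but does not send) a POST request carrying URL-encoded form data. */
    static HttpRequest build(String formBody) {
        return HttpRequest.newBuilder(URI.create("http://www.gefionsoftware.com/forecast"))
                // Assumed header: the usual content type for submitted HTML forms.
                .header("Content-Type", "application/x-www-form-urlencoded")
                // The parameters travel in the message body, not in the URI.
                .POST(HttpRequest.BodyPublishers.ofString(formBody))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("city=Hermosa+Beach&state=CA");
        System.out.println(request.method() + " " + request.uri().getPath());
        System.out.println("body length: "
                + request.bodyPublisher().get().contentLength());
    }
}
```

Note that the URI carries no query string here; with a GET request the same name/value pairs would follow a question mark in the URI instead.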
Due to the differences in how parameters are sent by GET and POST requests, as well as the differences in their intended purpose, browsers handle the requests in different ways. A GET request, parameters and all, can easily be saved as a bookmark, hardcoded as a link, and the response cached by the browser. Also, the browser knows that no damage will be done if it needs to send a GET request again automatically; for instance, if the user clicks the Reload button.
A POST request, on the other hand, can't be bookmarked as easily; the browser would have to save both the URI and the request message body. A POST request is intended to perform some possibly irreversible action on the server, so the browser must also ask the user if it's okay to send the request again. You have probably seen this type of confirmation dialog, shown in the accompanying figure.