WAR Files and Deployment

As we described in the introduction to this chapter, a WAR file is an archive that contains all the parts of a web application: Java class files for servlets and web services, JSPs, HTML pages, images, and other resources. The WAR file is simply a JAR file (which is itself a fancy ZIP file) with specified directories for the Java code and one designated configuration file: the web.xml file, which tells the application server what to run and how to run it. WAR files always have the extension .war, but they can be created and read with the standard jar tool.

The contents of a typical WAR might look like this, as revealed by the jar tool:

    $ jar tvf shoppingcart.war

        index.html
        purchase.html
        receipt.html
        images/happybunny.gif
        WEB-INF/web.xml
        WEB-INF/classes/com/mycompany/PurchaseServlet.class
        WEB-INF/classes/com/mycompany/ReturnServlet.class
        WEB-INF/lib/thirdparty.jar

When deployed, the name of the WAR becomes, by default, the root path of the web application—in this case, shoppingcart. Thus, the base URL for this web app, if deployed on http://www.oreilly.com, is http://www.oreilly.com/shoppingcart/, and all references to its documents, images, and servlets start with that path. The top level of the WAR file becomes the document root (base directory) for serving files. Our index.html file appears at the base URL we just mentioned, and our happybunny.gif image is referenced as http://www.oreilly.com/shoppingcart/images/happybunny.gif.

The WEB-INF directory (all caps, hyphenated) is a special directory that contains all deployment information and application code. This directory is protected by the web server, and its contents are not visible to outside users of the application, even if you add WEB-INF to the base URL. Your application classes can load additional files from this area using getResource() on the servlet context, however, so it is a safe place to store application resources. The WEB-INF directory also contains the web.xml file, which we’ll talk more about in the next section.

The WEB-INF/classes and WEB-INF/lib directories contain Java class files and JAR libraries, respectively. The WEB-INF/classes directory is automatically added to the classpath of the web application, so any class files placed here (using the normal Java package conventions) are available to the application. After that, any JAR files located in WEB-INF/lib are appended to the web app’s classpath (the order in which they are appended is, unfortunately, not specified). You can place your classes in either location. During development, it is often easier to work with the “loose” classes directory and use the lib directory for supporting classes and third-party tools. It’s also possible to install JAR files directly in the servlet container to make them available to all web apps running on that server. This is often done for common libraries that will be used by many web apps. The location for placing the libraries, however, is not standard and any classes that are deployed in this way cannot be automatically reloaded if changed—a feature of WAR files that we’ll discuss later. Servlet API requires that each server provide a directory for these extension JARs and that the classes there will be loaded by a single classloader and made visible to the web application.

Configuration with web.xml and Annotations

The web.xml file is an XML configuration file that lists servlets and related entities to be deployed, the relative names (URL paths) under which to deploy them, their initialization parameters, and their deployment details, including security and authorization. For most of the history of Java web applications, this was the only deployment configuration mechanism. However, as of the Servlet 3.0 API, there are additional options. Most configuration can now be done using Java annotations. We saw the WebServlet annotation used in the first example, HelloClient, to declare the servlet and specify its deployment URL path. Using the annotation, we could deploy the servlet to the Tomcat server without any web.xml file. Another option with the Servlet 3.0 API is to deploy servlet procedurally—using Java code at runtime.

In this section we will describe both the XML and annotation style of configuration. For most purposes, you will find it easier to use the annotations, but there are a couple of reasons to understand the XML configuration as well. First, the web.xml can be used to override or extend the hardcoded annotation configuration. Using the XML, you can change configuration at deployment time without recompiling the classes. In general, configuration in the XML will take precedence over the annotations. It is also possible to tell the server to ignore the annotations completely, using an attribute called metadata-complete in the web.xml. Next, there may be some residual configuration, especially relating to options of the servlet container, which can only be done through XML.

We will assume that you have at least a passing familiarity with XML, but you can simply copy these examples in a cut-and-paste fashion. (For details about working with Java and XML, see Chapter 24.) Let’s start with a simple web.xml file for our HelloClient servlet example. It looks like this:

    <web-app>
        <servlet>
            <servlet-name>helloclient1</servlet-name>
            <servlet-class>HelloClient</servlet-class>
        </servlet>
        <servlet-mapping>
            <servlet-name>helloclient1</servlet-name>
            <url-pattern>/hello</url-pattern>
        </servlet-mapping>
    </web-app>

The top-level element of the document is called <web-app>. Many types of entries may appear inside the <web-app>, but the most basic are <servlet> declarations and <servlet-mapping> deployment mappings. The <servlet> declaration tag is used to declare an instance of a servlet and, optionally, to give it initialization and other parameters. One instance of the servlet class is instantiated for each <servlet> tag appearing in the web.xml file.

At minimum, the <servlet> declaration requires two pieces of information: a <servlet-name>, which serves as a handle to reference the servlet elsewhere in the web.xml file, and the <servlet-class> tag, which specifies the Java class name of the servlet. Here, we named the servlet helloclient1. We named it like this to emphasize that we could declare other instances of the same servlet if we wanted to, possibly giving them different initialization parameters, etc. The class name for our servlet is, of course, HelloClient. In a real application, the servlet class would likely have a full package name, such as com.oreilly.servlets.HelloClient.

A servlet declaration may also include one or more initialization parameters, which are made available to the servlet through the ServletConfig object’s getInitParameter() method:

    <servlet>
        <servlet-name>helloclient1</servlet-name>
        <servlet-class>HelloClient</servlet-class>
        <init-param>
            <param-name>foo</param-name>
            <param-value>bar</param-value>
        </init-param>
    </servlet>

Next, we have our <servlet-mapping>, which associates the servlet instance with a path on the web server:

    <servlet-mapping>
        <servlet-name>helloclient1</servlet-name>
        <url-pattern>/hello</url-pattern>
    </servlet-mapping>

Here we mapped our servlet to the path /hello. (We could include additional url-patterns in the mapping if desired.) If we later name our WAR learningjava.war and deploy it on www.oreilly.com, the full path to this servlet would be http://www.oreilly.com/learningjava/hello. Just as we could declare more than one servlet instance with the <servlet> tag, we could declare more than one <servlet-mapping> for a given servlet instance. We could, for example, redundantly map the same helloclient1 instance to the paths /hello and /hola. The <url-pattern> tag provides some very flexible ways to specify the URLs that should match a servlet. We’ll talk about this in detail in the next section.

Finally, we should mention that although the web.xml example listed earlier will work on some application servers, it is technically incomplete because it is missing formal information that specifies the version of XML it is using and the version of the web.xml file standard with which it complies. To make it fully compliant with the standards, add a line such as:

    <?xml version="1.0" encoding="ISO-8859-1"?>

As of Servlet API 2.5, the web.xml version information takes advantage of XML Schemas. (We’ll talk about XML DTDs and XML Schemas in Chapter 24.) The additional information is inserted into the <web-app> element:

   <web-app
        xmlns="http://java.sun.com/xml/ns/j2ee"  
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://java.sun.com/xml/ns/j2ee
        http://java.sun.com/xml/ns/j2ee/web-app_2_5.xsd”
        version=2.5>

If you leave them out, the application may still run, but it will be harder for the servlet container to detect errors in your configuration and give you clear error messages.

The equivalent of the preceding servlet declaration and mapping is, as we saw earlier, our one line annotation:

@WebServlet(urlPatterns={"/hello", "/hola"})
public class HelloClient extends HttpServlet {
   ...
}

Here the WebServlet attribute urlPatterns allows us to specify one or more URL patterns that are the equivalent to the url-pattern declaration in the web.xml.

URL Pattern Mappings

The <url-pattern> specified in the previous example was a simple string, /hello. For this pattern, only an exact match of the base URL followed by /hello would invoke our servlet. The <url-pattern> tag is capable of more powerful patterns, however, including wildcards. For example, specifying a <url-pattern> of /hello* allows our servlet to be invoked by URLs such as http://www.oreilly.com/learningjava/helloworld or .../hellobaby. You can even specify wildcards with extensions (e.g., *.html or *.foo, meaning that the servlet is invoked for any path that ends with those characters).

Using wildcards can result in more than one match. Consider URLs ending in /scooby* and /scoobydoo*. Which should be matched for a URL ending in .../scoobydoobiedoo? What if we have a third possible match because of a wildcard suffix extension mapping? The rules for resolving these are as follows.

First, any exact match is taken. For example, /hello matches the /hello URL pattern in our example regardless of any additional /hello*. Failing that, the container looks for the longest prefix match. So /scoobydoobiedoo matches the second pattern, /scoobydoo*, because it is longer and presumably more specific. Failing any matches there, the container looks at wildcard suffix mappings. A request ending in .foo matches a *.foo mapping at this point in the process. Finally, failing any matches there, the container looks for a default, catchall mapping named /*. A servlet mapped to /* picks up anything unmatched by this point. If there is no default servlet mapping, the request fails with a “404 not found” message.

Deploying HelloClient

Once you’ve deployed the HelloClient servlet, it should be easy to add examples to the WAR as you work with them in this chapter. In this section, we’ll show you how to build a WAR by hand. In “Building WAR Files with Ant” later in this chapter, we’ll show a more realistic way to manage your applications using the popular build tool, Ant. You can also grab the full set of examples, along with their source code, in the learningjava.war file from this book’s website at http://oreil.ly/Java_4E.

To create the WAR by hand, we first create the WEB-INF and WEB-INF/classes directories. If you are using a web.xml file, place it into WEB-INF. Put the HelloClient.class into WEB-INF/classes. Use the jar command to create learningjava.war (WEB-INF at the “top” level of the archive):

    $ jar cvf learningjava.war WEB-INF

You can also include documents and other resources in the WAR by adding their names after the WEB-INF directory. This command produces the file learningjava.war. You can verify the contents using the jar command:

    $ jar tvf learningjava.war
    document1.html
    WEB-INF/web.xml
    WEB-INF/classes/HelloClient.class

Now all that is necessary is to drop the WAR into the correct location for your server. If you have not already, you should download and install Apache Tomcat. The location for WAR files is the webapps directory within your Tomcat installation directory. Place your WAR here, and start the server. If Tomcat is configured with the default port number, you should be able to point to the HelloClient servlet with one of two URLs: http://localhost:8080/learningjava/hello or http://<yourserver>:8080/learningjava/hello, where <yourserver> is the name or IP address of your server. If you have trouble, look in the logs directory of the Tomcat folder for errors.

Reloading web apps

All servlet containers are supposed to provide a facility for reloading WAR files; many support reloading of individual servlet classes after they have been modified. Reloading WARs is part of the servlet specification and is especially useful during development. Support for reloading web apps varies from server to server. Normally, all that you have to do is drop a new WAR in place of the old one in the proper location (e.g., the webapps directory for Tomcat) and the container shuts down the old application and deploys the new version. This works in Tomcat when the “autoDeploy” attribute is set (it is on by default) and also in BEA’s WebLogic application server when it is configured in development mode.

Some servers, including Tomcat, “explode” WARs by unpacking them into a directory under the webapps directory, or they allow you explicitly to configure a root directory (or “context”) for your unpacked web app through their own configuration files. In this mode, they may allow you to replace individual files, which can be especially useful for tweaking HTML or JSPs. Tomcat automatically reloads WAR files when they change them (unless configured not to), so all you have to do is drop an updated WAR over the old one and it will redeploy it as necessary. In some cases, it may be necessary to restart the server to make all changes take effect. When in doubt, shut down and restart.

Tomcat also provides a client-side “deployer” package that integrates with Ant to automate building, deploying, and redeploying applications. We’ll discuss Ant later in this chapter.

Error and Index Pages

One of the finer points of writing a professional-looking web application is taking care to handle errors well. Nothing annoys a user more than getting a funny-looking page with some technical mumbo-jumbo error information on it when he expected the receipt for his Christmas present. Through the web.xml file, it is possible to specify documents or servlets to handle error pages that are shown for various conditions, as well as the special case of welcome files (index files) that are invoked for paths corresponding to directories. At this time, there is no corresponding way to declare error pages or welcome files using annotations.

You can designate a page or servlet that can handle various HTTP error status codes, such as “404 Not Found” and “403 Forbidden,” using one or more <error-page>declarations:

    <web-app>
    ...
        <error-page>
             <error-code>404</error-code>
             <location>/notfound.html</location>
        </error-page>
        <error-page>
            <error-code>403</error-code>
            <location>/secret.html</location>
        </error-page>

Additionally, you can designate error pages based on Java exception types that may be thrown from the servlet. For example:

    <error-page>
        <exception-type>java.lang.IOException</exception-type>
        <location>/ioexception.html</location>
    </error-page>

This declaration catches any IOExceptions generated from servlets in the web app and displays the ioexception.html page. If no matching exceptions are found in the <error-page> declarations, and the exception is of type ServletException (or a subclass), the container makes a second try to find the correct handler. It looks for a wrapped exception (the “cause” exception) contained in the ServletException and attempts to match it to an error page declaration.

In the Servlet 3.0 API, you can also designate a catchall error page that will handle any unhandled error codes and exception types as follows:

    <error-page>
        <location>/anyerror.html</location>
    </error-page>

As we’ve mentioned, you can use a servlet to handle your error pages, just as you can use a static document. In fact, the container supplies several helpful pieces of information to an error-handling servlet, which the servlet can use in generating a response. The information is made available in the form of servlet request attributes through the method getAttribute():

    Object requestAttribute = servletRequest.getAttribute("name");

Attributes are like servlet parameters, except that they can be arbitrary objects. We have seen attributes of the ServletContext in The ServletContext API section. In this case, we are talking about attributes of the request. When a servlet (or JSP or filter) is invoked to handle an error condition, the following string attributes are set in the request:

    javax.servlet.error.servlet_name
    javax.servlet.error.request_uri
    javax.servlet.error.message

Depending on whether the <error-page> declaration was based on an <error-code> or <exception-type> condition, the request also contains one of the following two attributes:

    // status code Integer or Exception object
    javax.servlet.error.status_code
    javax.servlet.error.exception

In the case of a status code, the attribute is an Integer representing the code. In the case of the exception type, the object is the actual instigating exception.

Indexes for directory paths can be designated in a similar way. Normally, when a user specifies a directory URL path, the web server searches for a default file in that directory to be displayed. The most common example of this is the ubiquitous index.html file. You can designate your own ordered list of files to look for by adding a <welcome-file-list> entry to your web.xml file. For example:

    <welcome-file-list>
        <welcome-file>index.html</welcome-file>
        <welcome-file>index.htm</welcome-file>
    </welcome-file-list>

<welcome-file-list> specifies that when a partial request (directory path) is received, the server should search first for a file named index.html and, if that is not found, a file called index.htm. If none of the specified welcome files is found, it is left up to the server to decide what kind of page to display. Servers are generally configured to display a directory-like listing or to produce an error message.

Security and Authentication

One of the most powerful features of web app deployment with the Servlet API is the ability to define declarative security constraints, meaning that you can spell out in the web.xml file exactly which areas of your web app (URL paths to documents, directories, servlets, etc.) are login-protected, the types of users allowed access to them, and the class of security protocol required for communications. It is not necessary to write code in your servlets to implement these basic security procedures.

There are two types of entries in the web.xml file that control security and authentication. First are the <security-constraint> entries, which provide authorization based on user roles and secure transport of data, if desired. Second is the <login-config> entry, which determines the kind of authentication used for the web application.

Protecting Resources with Roles

Let’s take a look at a simple example. The following web.xml excerpt defines an area called “Secret documents” with a URL pattern of /secret/* and designates that only users with the role “secretagent” may access them. It specifies the simplest form of login process: the BASIC authentication model, which causes the browser to prompt the user with a simple pop-up username and password dialog box:

    <web-app>
    ...
        <security-constraint>
            <web-resource-collection>
                <web-resource-name>Secret documents</web-resource-name>
                <url-pattern>/secret/*</url-pattern>
            </web-resource-collection>
            <auth-constraint>
                <role-name>secretagent</role-name>
            </auth-constraint>
        </security-constraint>

        <login-config>
            <auth-method>BASIC</auth-method>
        </login-config>

Each <security-constraint> block has one <web-resource-collection> section that designates a named list of URL patterns for areas of the web app, followed by an <auth-constraint> section listing user roles that are allowed to access those areas.

We can do the equivalent configuration for a given servlet using the SecurityServlet annotation with an HttpConstraint annotation element as follows:

@ServletSecurity(
    @HttpConstraint(rolesAllowed = "secretagent")
)
public class SecureHelloClient extends HttpServlet
{ ...

You can add this annotation to our test servlet or add the XML example setup to the web.xml file for the learningjava.war file and prepare to try it out. However, there is one additional step that you’ll have to take to get this working: create the user role “secretagent” and an actual user with this role in our application server environment.

Access to protected areas is granted to user roles, not individual users. A user role is effectively just a group of users; instead of granting access to individual users by name, you grant access to roles, and users are assigned one or more roles. A user role is an abstraction from users. Actual user information (name and password, etc.) is handled outside the scope of the web app, in the application server environment (possibly integrated with the host platform operating system). Generally, application servers have their own tools for creating users and assigning individuals (or actual groups of users) their roles. A given username may have many roles associated with it.

When attempting to access a login-protected area, the user’s valid login will be assessed to see if she has the correct role for access. For the Tomcat server, adding test users and assigning them roles is easy; simply edit the file conf/tomcat-users.xml. To add a user named “bond” with the “secretagent” role, you’d add an entry such as:

  <user username="bond" password="007" roles="secretagent"/>

For other servers, you’ll have to refer to the documentation to determine how to add users and assign security roles.

Secure Data Transport

Before we move on, there is one more piece of the security constraint to discuss: the transport guarantee. Each <security-constraint> block may end with a <user-data-constraint> entry, which designates one of three levels of transport security for the protocol used to transfer data to and from the protected area over the Internet. For example:

    <security-constraint>
    ...
        <user-data-constraint>
            <transport-guarantee>CONFIDENTIAL</transport-guarantee>
        </user-data-constraint>
    </security-constraint>

The three levels are NONE, INTEGRAL, and CONFIDENTIAL. NONE is equivalent to leaving out the section, which indicates that no special transport is required. This is the standard for normal web traffic, which is generally sent in plain text over the network. The INTEGRAL level of security specifies that any transport protocol used must guarantee the data sent is not modified in transit. This implies the use of digital signatures or some other method of validating the data at the receiving end, but it does not require that the data be encrypted and hidden while it is transported. Finally, CONFIDENTIAL implies both INTEGRAL and encrypted. In practice, the only widely used secure transport in web browsers is SSL. Requiring a transport guarantee other than NONE typically forces the use of SSL by the client browser.

We can configure the equivalent transport security for a servlet using the ServletSecurity annotation along with the HttpMethodConstraint annotation, as follows:

@ServletSecurity(
    httpMethodConstraints = @HttpMethodConstraint( value="GET",
        transportGuarantee = ServletSecurity.TransportGuarantee.CONFIDENTIAL)
)
public class SecureHelloClient extends HttpServlet { ... }

@ServletSecurity(
    value = @HttpConstraint(rolesAllowed = "secretagent"),
    httpMethodConstraints = @HttpMethodConstraint( value="GET",
        transportGuarantee = ServletSecurity.TransportGuarantee.CONFIDENTIAL)
)
public class SecureHelloClient extends HttpServlet { ... }

Here we use the httpMethodConstraints attribute with an HttpMethodConstraint annotation to designate that the servlet may only be accessed using the HTTP GET method and only with CONFIDENTIAL level security. Combining the transport security with a rolesAllowed annotation can be done as shown in the preceding example.

Authenticating Users

This section shows how to declare a custom login form to perform user login. First, we’ll show the web.xml style and then discuss the Servlet 3.0 alternative, which gives us more flexibility.

The <login-conf> section determines exactly how a user authenticates herself (logs in) to the protected area. The <auth-method> tag allows four types of login authentication to be specified: BASIC, DIGEST, FORM, and CLIENT-CERT. In our example, we showed the BASIC method, which uses the standard web browser login and password dialog. BASIC authentication sends the user’s name and password in plain text over the Internet unless a transport guarantee has been used separately to start SSL and encrypt the data stream. DIGEST is a variation on BASIC that obscures the text of the password but adds little real security; it is not widely used. FORM is equivalent to BASIC, but instead of using the browser’s dialog, we can use our own HTML form to post the username and password data to the container. The form data can come from a static HTML page or from one generated by a servlet. Again, form data is sent in plain text unless otherwise protected by a transport guarantee (SSL). CLIENT-CERT is an interesting option. It specifies that the client must be identified using a client-side public key certificate. This implies the use of a protocol like SSL, which allows for secure exchange and mutual authentication using digital certificates. The exact method of setting up a client-side certificate is browser-dependent.

The FORM method is most useful because it allows us to customize the look of the login page (we recommend using SSL to secure the data stream). We can also specify an error page to use if the authentication fails. Here is a sample <login-config> using the form method:

    <login-config>
        <auth-method>FORM</auth-method>
        <form-login-config>
            <form-login-page>/login.html</form-login-page>
            <form-error-page>/login_error.html</form-error-page>
        </form-login-config>
    </login-config>

The login page must contain an HTML form with a specially named pair of fields for the name and password. Here is a simple login.html file:

    <html>
    <head><title>Login</title></head>
    <body>
        <form method="POST" action="j_security_check">
            Username: <input type="text" name="j_username"><br>
            Password: <input type="password" name="j_password"><br>
            <input type="submit" value="submit">
        </form>
    </body>
    </html>

The username field is called j_username, the password field is called j_password, and the URL used for the form action attribute is j_security_check. There are no special requirements for the error page, but normally you will want to provide a “try again” message and repeat the login form.

In the Servlet 3.0 API, the HttpServletRequest API contains methods for explicitly logging in and logging out a user. However, it is also specified that a user’s login is no longer valid after the user session times out or is invalidated. Therefore, you can effectively log out the user by calling invalidate() on the session:

    request.logout();    request.getSession().invalidate();

With Servlet 3.0, we can also take control of the login process ourselves by utilizing the ServletRequest login() method to perform our own login operation. All we have to do is arrange our own login servlet that accepts a username and password (securely) and then calls the login method. This gives you great flexibility over how and when the user login occurs. And, of course, you can log the user out with the corresponding logout() method.

@ServletSecurity(
    httpMethodConstraints = @HttpMethodConstraint( value="POST",
        transportGuarantee = ServletSecurity.TransportGuarantee.CONFIDENTIAL)
)
@WebServlet( urlPatterns={"/mylogin"} )
public class MyLogin extends HttpServlet
{
    public void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException
    {
       String user = request.getParameter("user");
       String password = request.getParameter("pass");
       request.login( user, password );
       // Dispatch or redirect to the next page...
    }

Procedural Authorization

We should mention that in addition to the declarative security offered by the web.xml file, servlets may perform their own active procedural (or programmatic) security using all the authentication information available to the container. We won’t cover this in detail, but here are the basics.

The name of the authenticated user is available through the method HttpServletRequest getRemoteUser(), and the type of authentication provided can be determined with the getAuthType() method. Servlets can work with security roles using the isUserInRole() method. (Doing this requires adding some additional mappings in the web.xml file, which allows the servlet to refer to the security roles by reference names.)

For advanced applications, a java.security.Principal object for the user can be retrieved with the getUserPrincipal() method of the request. In the case where a secure transport like SSL was used, the method isSecure() returns true, and detailed information about how the principal was authenticated—the cipher type, key size, and certificate chain—is made available through request attributes. It is useful to note that the notion of being “logged in” to a web application, from the servlet container’s point of view, is defined as there being a valid (non-null) value returned by the getUserPrincipal() method.

Get Learning Java, 4th Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.