Chapter 1. Introduction

This chapter gives an overview of Apache's architecture and explains how to obtain the software, how to start and stop the server, and the basics of its configuration files.

Architectural Overview

Apache is normally run as a system daemon or service, with a parent process or thread supervising a number of child processes or threads that perform the request processing. Apart from certain core features, most functionality is implemented by modules, which may be either statically linked into the server or dynamically loaded on startup.

Operating systems vary in how they implement features such as networking and multiprocessing. Apache version 2.0 introduced MultiProcessing Modules (MPMs) to provide networking and scheduling models tailored to particular operating systems and usage patterns, as listed in Table 1-1. MPMs use the native features of the operating system and provide scheduling using processes, threads, or a mix of the two. Apache uses only a single MPM at any time, and it must be statically compiled into the server.

Table 1-1. MultiProcessing modules

Module        Description
beos          Multithreaded MPM for the BeOS operating system
event         Experimental variant of the worker MPM
mpm_netware   Threaded MPM for Novell NetWare
mpm_winnt     Twin process, multithreaded MPM for Windows
mpmt_os2      Hybrid multiprocess, multithreaded MPM for OS/2
prefork       Traditional nonthreaded, preforking MPM
worker        Hybrid multiprocess, multithreaded MPM

The MPMs, other modules, and the core web server build upon the Apache Portable Runtime (APR), which provides a consistent, platform-independent interface to the underlying operating system. APR includes APIs to access SQL databases and LDAP servers; these are used in two framework modules, mod_dbd and mod_ldap, which provide common facilities that other modules may use.

Operational Overview

On startup, Apache goes through an initialization stage before entering its operational state. During the initialization stage, Apache reads and verifies its configuration files, opens network connections and log files, acquires system resources, and creates the pool of child processes or threads that will handle requests. Apache is normally started with root privileges but relinquishes those privileges before it enters the operational state.

Once Apache has entered its operational state, the child processes or threads handle incoming requests. Requests are processed in a number of phases, and each phase provides a number of hooks for modules to participate in the processing. For each hook, Apache calls registered functions in turn until they have all been called or until one of them indicates either that processing for that hook is complete or that an error has occurred.

Modules register handlers for the phases during which they need to influence the handling of the request. Generally, a module only registers handlers for one or two phases.

The phases occur in the following order:

Request parsing

The request URL is mapped into the filesystem namespace.

Security controls

Access control, authentication, and authorization rules are applied.

Request preparation

The request URL and mapped file path are matched against the configuration to determine the content handler and any filters to use, and to set other metadata.

Content generation

Runs the chosen content handler with any filters.

Request logging

Logs the request.

This picture is complicated slightly by the fact that modules can issue subrequests, either to return a document other than the one requested or to check what the response would be if a request were made for a different resource.

Current Versions of Apache

At the time of this writing (summer 2008), there are three major versions of Apache in common use: 1.3, 2.0, and 2.2.

Apache 1.3 was released in June 1998 and for many years was the most widely used web server. Work started in 2000 on a new architecture for Apache; the first production release of the new version, Apache 2.0, was made in April 2002. At the same time, a new version-numbering scheme was introduced: odd-numbered minor versions, such as 2.1 or 2.3, are development versions; even-numbered minor versions, such as 2.0 or 2.2, are stable versions. The first 2.2 release was made in November 2005, and the latest point release, 2.2.9, was made in July 2008.

The Apache website includes documentation on the changes between versions and notes on upgrading.

How to Obtain Apache

The Apache web server is available for most modern computing platforms—most Linux and BSD distributions offer it as a standard package, and it is included in Mac OS X. Binary packages for Microsoft Windows are available from the Apache website and its mirrors, as are source and other binary packages. It is advisable to familiarize yourself with the particulars of the distribution you deploy, as packagers invariably change details to conform with the conventions of their target platform.

Alternatively, compiling Apache from source is quite straightforward, and has the advantage of giving complete control over how it is built, which modules are statically included in the server, and so on. The source distribution includes instructions on the build process.

By default, the source distribution installs into the subdirectories listed in Table 1-2 under /usr/local/apache2. Most third-party distributions use variations on this scheme.

Table 1-2. Layout of standard Apache directories

Directory   Contents
bin         Program files (administrative program files are often placed in an sbin directory).
build       Files used by the apxs utility.
cgi-bin     CGI scripts.
conf        Configuration files (often stored in the /etc directory hierarchy).
error       HTTP error messages in multiple languages.
htdocs      HTML documents.
icons       Icon image files.
include     C language include files needed for compiling third-party modules.
logs        Log files and runtime status files, such as the PID file; however, status files are often stored in a run directory.
man         Manual pages (often stored in the system man directories).
manual      A local copy of the Apache manual.
modules     Loadable modules.

Apache distributions include a number of modules and utility programs; these are listed in Appendix A.

Starting and Stopping Apache

Most packaged distributions of Apache arrange for the server to be started automatically when the system is booted, and stopped when the system is halted.

On Unix-like systems, Apache normally runs as a daemon process. A shell script, apachectl, is included with Apache to automate the process of starting and stopping the daemon. This script is usually invoked by a system startup script. Apache will respond to the following signals sent to the parent process (the process ID of which is stored in the PID file):

TERM

Stops the server by causing the parent process to attempt to kill each of the child processes and then terminate itself.

HUP

Restarts the server by causing the parent process to kill off each of the child processes, reread the configuration files, and then spawn new child processes. Server statistics are reset to zero on a restart.

USR1

Initiates a graceful restart. Child processes exit either after processing the current request or immediately if not currently serving a request. The parent process rereads the configuration files and starts to spawn new child processes to maintain the appropriate number of server processes. Server statistics are not reset on a graceful restart.

WINCH

Initiates a graceful stop. Child processes exit either after processing the current request or immediately if not currently serving a request. The parent process removes the PID file and stops listening on all ports, but continues monitoring until any remaining children have exited or the timeout has expired.

On Windows, if Apache was installed as a service, it can be started and stopped with the NET START and NET STOP commands or with the Apache Service Monitor.

Command-Line Options

Should you need to start Apache manually, the server program takes the following command-line options:

-C directive

Processes directive before reading configuration files.

-c directive

Processes directive after reading configuration files.

-d directory

Sets the initial value for ServerRoot.

-D parameter

Defines a parameter that can be used in <IfDefine> sections. Certain startup options are invoked by setting parameters (DEBUG, FOREGROUND, NO_DETACH, ONE_PROCESS).

-E filename

Sets the error log file for server startup.

-e level

Sets the log level for server startup.

-f file

Main configuration file (default is conf/httpd.conf).

-h

Prints a short help message containing a summary of the command-line options.

-k command

Executes one of the following commands: start, restart, graceful, stop, or graceful-stop. Also, on Windows only: install, uninstall.

-L

Lists available configuration directives (provided by compiled-in modules) and exits.

-l

Lists compiled-in modules and exits.

-M

Lists compiled-in and shared modules and exits (equivalent to -D DUMP_MODULES).

-n name

Windows only: service name for Apache.

-S

Shows virtual host settings (equivalent to -D DUMP_VHOSTS).

-t

Tests the syntax of configuration files, checking for the existence of document root directories, and exits.

-v

Prints version and build date and exits.

-V

Shows compilation settings and exits.

-w

Windows only: keeps the console window open after Apache has started.

-X

Runs in single-process debug mode (equivalent to -D DEBUG).

Configuration Files

Every aspect of Apache’s behavior is controlled by directives stored in its configuration files. When Apache starts or restarts, it first reads the main server configuration file from the default location, or from the location specified with the -f command-line argument. Further configuration files may be included with the Include directive.

Configuration files are plain text files that contain configuration directives, blank lines, and comments. Leading whitespace on a line is ignored, as are blank lines. Lines starting with a hash sign (#) are regarded as comments.
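The following fragment is a minimal sketch of these syntax rules. The paths and the name of the included file are hypothetical and chosen only for illustration.

```apache
# A comment: any line whose first non-blank character is "#" is ignored.

ServerRoot "/usr/local/apache2"

    # Leading whitespace is ignored, so indentation is purely cosmetic.
    Listen 80

# Pull in a further configuration file (path relative to ServerRoot).
Include conf/extra/example.conf
```

Note that a comment must occupy a line of its own; it cannot follow a directive on the same line.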

Configuration Directive Format

Apache configuration directives are described in a standard format as shown here.

DirectorySlash          SVDH (Indexes)
mod_dir (B)             ON

DirectorySlash { ON | OFF }

Compatibility: 2.0.51 and later

If set to ON, then requests that map to a directory but that do not end in a trailing slash will be redirected to the same URL with a trailing slash appended, enabling automatic directory indexes and relative URLs to work correctly.

The top line gives the name of the directive on the left, and the list of contexts in which the directive may be used on the right, using the abbreviations defined in Table 1-3. If the directive can be used in a per-directory configuration file and is controlled by an AllowOverride directive category, the category keyword is included in parentheses after the context abbreviations.

Table 1-3. Context abbreviations

Context   Description
S         Valid in global context, i.e., in the server configuration files outside of any virtual host or filesystem container sections
V         Valid in virtual host sections
D         Valid in directory-type container sections (<Directory>, <Files>, <Location>, and the *Match variants)
H         Valid in per-directory configuration file (named .htaccess by default)
*         Indicates that the directive may be given more than once in a context

The second line lists the name of the Apache module that implements the directive on the left (see Appendix A for a list of the modules included in the Apache distribution) with the status of the module in parentheses, using the abbreviations from Table 1-4. The module may be listed as “MPM,” in which case the MPMs implementing the directive will be noted, or as “core” to indicate that the directive is implemented by the Apache core module. The default value for the directive is shown on the right.

Table 1-4. Module status codes

Status   Description
B        Base module: included in the Apache distribution and compiled-in by default
E        Extension module: included in the Apache distribution but not compiled-in by default
X        Experimental module: included in the Apache distribution but not compiled-in by default

The next line gives the directive syntax, followed by compatibility notes if relevant. Directives are case-insensitive, as are most arguments that do not refer to case-sensitive objects such as filenames.
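As an illustration of how such an entry translates into practice, the sketch below uses DirectorySlash in two of the contexts it is valid in. The directory path is a hypothetical example.

```apache
# Server context (S): applies to the whole server unless overridden.
DirectorySlash On

# Directory context (D): turn the redirect off for one subtree.
<Directory "/usr/local/apache2/htdocs/archive">
    # Directive names and On/Off arguments are case-insensitive,
    # so this line means the same as "DirectorySlash Off".
    directoryslash off
</Directory>
```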

Basic Configuration File Directives

These directives control where configuration files are located, which additional files are loaded at startup, the name of the per-directory configuration files, and which directives are allowed in those files.

ServerRoot              S
core                    depends on compilation settings

ServerRoot directory

Root directory for the server. May be overridden with the -d command-line option. Relative paths for other directives, such as Include and LockFile, are interpreted as being relative to this directory. Binary packages often have different defaults from the standard Apache layout.
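A short sketch of how relative paths are resolved, assuming the standard source layout under /usr/local/apache2; the log and PID filenames are illustrative.

```apache
ServerRoot "/usr/local/apache2"

# Relative paths are resolved against ServerRoot, so these refer to
# /usr/local/apache2/logs/error_log and /usr/local/apache2/logs/httpd.pid.
ErrorLog logs/error_log
PidFile  logs/httpd.pid

# Absolute paths are used as given.
ErrorLog /var/log/apache2/error_log
```

In a real configuration only one of the two ErrorLog lines would be present; they are shown together here only to contrast the two forms.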

Include                 SVD*
core

Include { filepath | directory }

Compatibility: wildcard matching available in 2.0.41 and later

Reads and processes the contents of the named configuration file, which is logically included in place of the directive. The filename part of the path may include shell-style wildcard characters, in which case, all matching files are included in lexicographical order. If a directory is specified, then all files in the directory and any subdirectories are included, which is not recommended, as it may pick up unintended files.
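For example, the two forms might be used as follows. The file and directory names are hypothetical.

```apache
# Include a single file (path relative to ServerRoot).
Include conf/extra/httpd-vhosts.conf

# Include every file in conf/sites whose name ends in .conf, in
# lexicographical order. Backup and editor temporary files that do not
# match the pattern are skipped, which is why this form is safer than
# naming the directory itself.
Include conf/sites/*.conf
```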

AccessFileName          SV*
core                    .htaccess

AccessFileName filename ...

Names the per-directory configuration file. Although the directive name and default value imply otherwise, the file is not restricted to access control directives. The AllowOverride directive controls which directives are allowed.

AllowOverride           D
core                    All

AllowOverride category ...

The AllowOverride directive is only allowed in non-regular expression <Directory> sections. It specifies whether per-directory configuration files are read for directories matched by the section and for subdirectories of those directories—and, if read, which of the categories of directives listed in Table 1-5 are allowed in those files. If a per-directory configuration file contains directives that are not allowed, an internal server error is generated.

Table 1-5. Per-directory override categories

Category     Meaning
None         Per-directory configuration files are not read at all.
All          All directives are valid in per-directory configuration files.
AuthConfig   Authentication and authorization directives.
FileInfo     Directives controlling document attributes.
Indexes      Directory indexing directives.
Limit        Access control directives.
Options      Directory features.

An AllowOverride directive replaces any settings defined for higher-level directories. The keywords All and None are parsed in the same way as the other keywords, which allows some strange but valid combinations.
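The following sketch shows AccessFileName and AllowOverride working together. The alternative filename and the directory paths are hypothetical.

```apache
# Use a non-default name for per-directory configuration files.
AccessFileName .config

# Do not read per-directory files anywhere by default.
<Directory />
    AllowOverride None
</Directory>

# Below this subtree, read .config files and accept only authentication
# and directory-indexing directives in them. This setting replaces,
# rather than adds to, the one inherited from "/".
<Directory "/usr/local/apache2/htdocs/members">
    AllowOverride AuthConfig Indexes
</Directory>
```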

Conditional Sections

Conditional sections enclose blocks of directives that Apache should ignore while parsing the configuration files if the condition specified on the section start directive is not met. Conditional sections may be nested.

<IfDefine>              SVDH*
core

<IfDefine [!]parameter >
    ...
</IfDefine>

Enclosed directives are only evaluated if the named parameter is defined with the -D command-line option, or if it is not defined when parameter is preceded by an exclamation mark (!).
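A minimal sketch, assuming a hypothetical parameter named MAINTENANCE and illustrative document root paths:

```apache
# Evaluated only when the server is started with: httpd -D MAINTENANCE
<IfDefine MAINTENANCE>
    DocumentRoot "/usr/local/apache2/maintenance"
</IfDefine>

# Evaluated only when MAINTENANCE is not defined.
<IfDefine !MAINTENANCE>
    DocumentRoot "/usr/local/apache2/htdocs"
</IfDefine>
```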

<IfModule>              SVDH*
core

<IfModule [!]module >
    ...
</IfModule>

Enclosed directives are only evaluated if the specified module is active, or inactive if module is preceded by an exclamation mark (!). Modules can be specified by their identifier or by their source filename, including the trailing ".c", as printed by the -l command-line option. The use of module identifiers was introduced in version 2.1. This directive can be used to differentiate between 1.3 and the newer versions, as the core module is named core.c as of 2.0; in 1.3, it was http_core.c.
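The sketch below shows both ways of naming a module, followed by a common use: supplying different settings depending on which MPM was compiled in. The numeric values are illustrative only, not recommendations.

```apache
# Test by module identifier...
<IfModule dir_module>
    DirectoryIndex index.html
</IfModule>

# ...or by source filename, which older versions also understand.
<IfModule mod_dir.c>
    DirectoryIndex index.html
</IfModule>

# Only one MPM is ever active, so at most one of these sections applies.
<IfModule prefork.c>
    StartServers 5
    MaxClients   150
</IfModule>
<IfModule worker.c>
    StartServers 2
    MaxClients   150
</IfModule>
```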

<IfVersion>             SVDH*
mod_version (E)

<IfVersion [[!]operator] major[.minor[.patch]] >
    ...
</IfVersion>

Compatibility: 2.0.56 and later

Enclosed directives are only evaluated if the Apache version matches the specified criteria. If the patch and minor version number components are omitted, they are taken as zero. The comparison operator can be one of the following: =, <, <=, >, or >= (== is a synonym for =).

Regular expression matching is also supported: you can use the ~ operator and specify the version as a string, or use the = operator and specify the version as /regex/.

All operators may be preceded by an exclamation mark (!) to reverse their meaning.
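A brief sketch of the three styles of comparison, assuming mod_version is available. The enclosed directives are placeholders chosen only because they are valid in the server context.

```apache
# Numeric comparison: omitted components count as zero, so this
# matches 2.2.0 and later.
<IfVersion >= 2.2>
    ServerSignature Off
</IfVersion>

# Regular expression with the = operator: matches 2.2.0 through 2.2.8.
<IfVersion = /^2\.2\.[0-8]$/>
    LogLevel info
</IfVersion>

# Negated regular expression with the ~ operator: every version
# except the 2.0 series.
<IfVersion !~ ^2\.0\.>
    LogLevel warn
</IfVersion>
```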

Container Sections

Container sections allow the scope of directives to be limited by directory, filename, URL, or request method. The <Directory>, <DirectoryMatch>, <Files>, and <FilesMatch> directives introduce filesystem containers, while the <Location> and <LocationMatch> directives introduce webspace containers. <Limit> and <LimitExcept> introduce container sections limited by request method.

The non-*Match filesystem and webspace container directives each take a shell wildcard pattern argument. These directives have an alternative form in which the first argument is specified as a literal tilde (~) followed by a second argument that is interpreted as a regular expression. This form is exactly equivalent to the corresponding *Match directives, which should be preferred, as the tilde is easy to overlook.

Shell wildcard patterns may contain metacharacters and bracket expressions: ? matches a single character; * matches any number of characters; and [expr] matches any of the characters, or ranges of characters, enclosed between the brackets.
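The following sketch contrasts wildcard patterns with the two regular expression forms. The paths and filenames are hypothetical, and the Order and Deny directives (host-based access control as used in Apache 2.2) serve only as a placeholder payload.

```apache
# Wildcard: the public_html directory of any user under /home.
<Directory "/home/*/public_html">
    Options Indexes
</Directory>

# Bracket expression: draft-0.html through draft-9.html.
<Files "draft-[0-9].html">
    Order allow,deny
    Deny from all
</Files>

# Tilde form: the second argument is a regular expression.
<Files ~ "^\.ht">
    Order allow,deny
    Deny from all
</Files>

# Equivalent and preferred: the *Match directive makes the regex explicit.
<FilesMatch "^\.ht">
    Order allow,deny
    Deny from all
</FilesMatch>
```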

When processing a request, directives within filesystem and webspace sections are applied in the following sequence:

  1. Non-regular expression <Directory> sections and per-directory configuration files, working from shortest to longest pathname. The per-directory configuration files override the <Directory> sections.

  2. <DirectoryMatch> sections.

  3. <Files> and <FilesMatch> sections.

  4. <Location> and <LocationMatch> sections.

Directives in <Directory> and <DirectoryMatch> sections and per-directory configuration files apply to subdirectories unless overridden later.

Container sections for a matching virtual host are applied after those for the main server.
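To illustrate the effect of this ordering, consider the sketch below. It assumes a DocumentRoot of /usr/local/apache2/htdocs, so that the URL path /docs maps to the directory named in the first section; both paths are hypothetical.

```apache
# Applied first: filesystem section for the directory.
<Directory "/usr/local/apache2/htdocs/docs">
    Options Indexes
</Directory>

# Applied last: webspace section for the same content.
<Location "/docs">
    Options None
</Location>
```

For a request under /docs, the <Location> section is processed after the <Directory> section, so the effective setting is Options None.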

<Directory>             SV*
core

<Directory pattern >
    ...
</Directory>

Container for directives that apply only to directories that match the specified pattern (and their subdirectories).

<DirectoryMatch>        SV*
core

<DirectoryMatch regex >
    ...
</DirectoryMatch>

Enclosed directives apply only to directories (and their subdirectories) that match the specified regular expression.

<Files>                 SVDH*
core

<Files pattern >
    ...
</Files>

Enclosed directives apply only to files that match the specified filename pattern.

<FilesMatch>            SVDH*
core

<FilesMatch regex >
    ...
</FilesMatch>

Enclosed directives apply only to files that match the regular expression.

<Location>              SV*
core

<Location pattern >
    ...
</Location>

Enclosed directives apply only to URLs that match the specified pattern.

<LocationMatch>         SV*
core

<LocationMatch regex >
    ...
</LocationMatch>

Enclosed directives apply only to matching URLs.

<Limit>                 SVDH*
core

<Limit method ... >
    ...
</Limit>

Enclosed directives apply only to matching methods.

<LimitExcept>           SVDH*
core

<LimitExcept method ... >
    ...
</LimitExcept>

Enclosed directives apply to nonmatching methods.
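A minimal sketch of the two method containers, using hypothetical directory paths and the Apache 2.2 Order and Deny directives as the restricted payload:

```apache
<Directory "/usr/local/apache2/htdocs/readonly">
    # Applies only to the listed methods: refuse PUT and DELETE.
    <Limit PUT DELETE>
        Order allow,deny
        Deny from all
    </Limit>
</Directory>

<Directory "/usr/local/apache2/htdocs/strict">
    # Applies to every method not listed, including nonstandard ones.
    <LimitExcept GET POST>
        Order allow,deny
        Deny from all
    </LimitExcept>
</Directory>
```

Because <Limit> leaves unlisted methods unrestricted, <LimitExcept> is generally the safer choice when the intent is to restrict access.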
