What Does a Web Server Do?

The whole business of a web server is to translate a URL either into a filename, and then send that file back over the Internet, or into a program name, and then run that program and send its output back. That is the meat of what it does: all the rest is trimming.
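
To make that translation concrete, here is a minimal sketch in Python of the two cases. It is not how Apache does it: the directory names, the /cgi-bin/ prefix, and the index.html default are all assumptions chosen for illustration, and the sketch ignores the security checks a real server must make on the path.

```python
from pathlib import PurePosixPath

# Hypothetical locations, assumed only for this illustration.
DOC_ROOT = PurePosixPath("/usr/www/site/htdocs")
CGI_ROOT = PurePosixPath("/usr/www/site/cgi-bin")
CGI_PREFIX = "/cgi-bin/"


def translate(url_path):
    """Map a URL path to ("program", name) or ("file", name)."""
    if url_path.startswith(CGI_PREFIX):
        # The URL names a program: the server would run it and
        # send its output back.
        return "program", str(CGI_ROOT / url_path[len(CGI_PREFIX):])
    # Otherwise the URL names a file: the server would send it back.
    # A bare "/" falls back to an assumed default page.
    relative = url_path.lstrip("/") or "index.html"
    return "file", str(DOC_ROOT / relative)
```

Calling translate("/") yields a filename under the document directory, while translate("/cgi-bin/mycgi") yields the name of a program to run.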

When you fire up your browser and connect to the URL of someone’s home page — say the notional http://www.butterthlies.com/ we shall meet later on — you send a message across the Internet to the machine at that address. That machine, you hope, is up and running; its Internet connection is working; and it is ready to receive and act on your message.

URL stands for Uniform Resource Locator. A URL such as http://www.butterthlies.com/ comes in three parts:

<scheme>://<host>/<path>

So, in our example, <scheme> is http, meaning that the browser should use HTTP (Hypertext Transfer Protocol); <host> is www.butterthlies.com; and <path> is /, traditionally meaning the top page of the host.[1] The <host> may contain either an IP address or a name, which the browser will then convert to an IP address. Using HTTP 1.1, your browser might send the following request to the computer at that IP address:

GET / HTTP/1.1
Host: www.butterthlies.com
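
The step from URL to request can be sketched in a few lines of Python using the standard urllib.parse module. The function names split_url and build_request are our own, and the sketch assumes a plain http URL on the default port with no query string; a real browser sends many more headers.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Break a URL into its scheme, host, and path."""
    parts = urlsplit(url)
    # An empty path is treated as "/", the top page of the host.
    return parts.scheme, parts.hostname, parts.path or "/"


def build_request(url):
    """Build the minimal HTTP/1.1 request for a URL."""
    _scheme, host, path = split_url(url)
    # Each line ends with CRLF; a blank line ends the headers.
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"
```

For http://www.butterthlies.com/, split_url gives the three parts http, www.butterthlies.com, and /, and build_request produces the two lines shown above followed by a blank line.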

The request arrives at port 80 (the default HTTP port) on the host www.butterthlies.com. The message is in four parts: a method (an HTTP method, not a URL method), which in this case is GET but could equally be PUT, POST, DELETE, or CONNECT; the Uniform Resource Identifier (URI), /; the version of the protocol we are using, HTTP/1.1; and a series of headers that modify the request (in this case, a Host header, which is used for name-based virtual hosting: see Chapter 4). It is then up to the web server running on that host to make something of this message.
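
As a rough illustration of the first thing a server has to do with such a message, the following Python sketch pulls it apart into the four parts just described. The function name parse_request is hypothetical, and the sketch assumes a well-formed request with CRLF line endings; it does none of the error handling a real server needs.

```python
def parse_request(raw):
    """Split a raw request into method, URI, version, and headers."""
    # Everything before the first blank line is the request head.
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    # The first line carries the method, the URI, and the version.
    method, uri, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return method, uri, version, headers
```

Given the request shown earlier, this returns GET, /, HTTP/1.1, and a dictionary holding the single Host header.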

The host machine may be a whole cluster of hypercomputers costing an oil sheik’s ransom or just a humble PC. In either case, it had better be running a web server, a program that listens to the network and accepts and acts on this sort of message.



[1] Note that since a URL has no predefined meaning, this really is just a tradition, though a pretty well entrenched one in this case.
