Chapter 9. Web Programming

This chapter introduces a variety of Python facilities for manipulating URLs, opening documents in web browsers, submitting HTTP requests to web servers, and executing programs on web servers to respond to HTTP requests. Along the way we will also look at the foundations of the socket technology that underlies web interactions and the basics of constructing HTML forms to use as an interface to server-based programs.

Manipulating URLs: urllib.parse

The urllib.parse module provides functions for manipulating URL strings. The general form of a URL is:

scheme://network_location/path;parameters?query#fragment

The fragment, which doesn’t usually appear if the URL includes a query, is a reference to a particular place within a web page. You may not have seen or noticed the parameters portion of a URL before; it is not usually part of the URLs visible to a browser’s user, and it appears only relatively rarely, as part of the value of href attributes in HTML tags. The network_location can be further dissected into the following components:

username:password@hostname:port

The username/password combination is a way of supplying login information to sites that accept this primitive kind of authentication. The port needs to be included whenever the server program that responds to requests for the given scheme is listening on a nonstandard port number, that is, a port other than the scheme’s default (80 for http, 443 for https). We’ll see plenty of concrete examples of various forms of URLs in this chapter.
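The general form and the network_location breakdown described above can be explored with urllib.parse.urlparse. The following is a minimal sketch, not a definitive recipe: the URL, the user name, the password, and the query values are all made up for illustration, and the URL deliberately packs in every component at once, which real URLs rarely do.

```python
from urllib.parse import urlparse

# Hypothetical URL, invented for illustration; it uses every component
# of scheme://network_location/path;parameters?query#fragment
url = ("http://alice:secret@www.example.org:8080"
       "/data/seqs;type=fasta?id=42&format=text#results")

parts = urlparse(url)

# The six top-level components
print(parts.scheme)    # http
print(parts.netloc)    # alice:secret@www.example.org:8080
print(parts.path)      # /data/seqs
print(parts.params)    # type=fasta
print(parts.query)     # id=42&format=text
print(parts.fragment)  # results

# The network location, dissected further
print(parts.username)  # alice
print(parts.password)  # secret
print(parts.hostname)  # www.example.org
print(parts.port)      # 8080 (an int, or None when no port is given)
```

Components that are absent from a URL come back as empty strings, except for username, password, hostname, and port, which come back as None.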

Certain characters are “reserved” for use in URL syntax ...
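As a small illustration of what reserved characters imply in practice, the sketch below assumes that percent-encoding with urllib.parse.quote and urllib.parse.unquote is the relevant tool. The sample string is hypothetical, and passing safe="" is a choice made here so that even "/" gets encoded; by default quote leaves "/" alone.

```python
from urllib.parse import quote, unquote

# Hypothetical value containing characters that are reserved in URL syntax
raw = "gene name&id=BRCA1/2"

# safe="" asks quote to percent-encode "/" as well (it is left as-is by default)
encoded = quote(raw, safe="")
print(encoded)           # gene%20name%26id%3DBRCA1%2F2

# unquote reverses the encoding
decoded = unquote(encoded)
print(decoded == raw)    # True
```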
