Module urllib Revisited

The httplib module we just met provides low-level control for HTTP clients. When dealing with items available on the Web, though, it’s often easier to code downloads with Python’s standard urllib module, introduced in the FTP section earlier in this chapter. Since this module is another way to talk HTTP, let’s expand on its interfaces here.

Recall that given a URL, urllib either downloads the requested object over the Net to a local file, or gives us a file-like object from which we can read the requested object’s contents. As a result, the script in Example 14-30 does the same work as the httplib script we just wrote, but requires noticeably less code.
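The two modes described above can be sketched as follows. This is a minimal illustration, not the book's listing: it assumes a Python 3 interpreter, where these calls live in `urllib.request` rather than the top-level `urllib` module used in this chapter, and it reads a local `file:` URL so it runs without a network connection. The helper names `fetch_lines` and `fetch_to_file` and the file `sample.txt` are hypothetical.

```python
import os
import tempfile
from pathlib import Path
from urllib.request import urlopen, urlretrieve   # Python 3 home of these calls (assumption)

def fetch_lines(url, showlines=6):
    """Mode 1: get a file-like object for the URL and read from it."""
    with urlopen(url) as remote:
        return remote.readlines()[:showlines]      # lines come back as bytes

def fetch_to_file(url, localname):
    """Mode 2: download the object named by the URL into a local file."""
    filename, headers = urlretrieve(url, localname)
    return filename

# Demo on a local file: URL; an http: URL is passed the same way.
workdir = tempfile.mkdtemp()
source = Path(workdir) / "sample.txt"              # hypothetical sample file
source.write_bytes(b"line one\nline two\nline three\n")
url = source.as_uri()                              # file:///.../sample.txt

first = fetch_lines(url, showlines=2)
copy = fetch_to_file(url, os.path.join(workdir, "copy.txt"))
print(first)
print(Path(copy).read_bytes() == source.read_bytes())
```

Either way, the calling code never touches sockets or protocol details directly; the URL string alone selects the transport.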

Example 14-30. PP3E\Internet\Other\http-getfile-urllib1.py

###################################################################
# fetch a file from an HTTP (web) server over sockets via urllib;
# urllib supports HTTP, FTP, files, etc. via URL address strings;
# for HTTP, the URL can name a file or trigger a remote CGI script;
# see also the urllib example in the FTP section, and the CGI
# script invocation in a later chapter; files can be fetched over
# the net with Python in many ways that vary in complexity and
# server requirements: sockets, FTP, HTTP, urllib, CGI outputs;
# caveat: should run urllib.quote on filename--see later chapters;
###################################################################

import sys, urllib
showlines = 6
try:
    servername, filename = sys.argv[1:]              # cmdline args?
except:
    servername, ...
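The caveat in the listing's header comment, about quoting the filename, can be illustrated with a short sketch. It assumes Python 3, where the quoting function is `urllib.parse.quote` rather than `urllib.quote`. The helper `build_url`, the server name, and the filenames are hypothetical and are not taken from the book's script.

```python
from urllib.parse import quote   # Python 3 home of urllib.quote (assumption)

def build_url(servername, filename):
    """Join a server name and a file path, escaping characters unsafe in URLs."""
    # quote() leaves "/" alone by default and percent-escapes spaces, "&", etc.
    return "http://%s/%s" % (servername, quote(filename))

print(build_url("example.com", "cgi-bin/hello world.py"))
print(build_url("example.com", "a&b.txt"))
```

Without the escaping step, a space or an ampersand in the filename would produce a URL that the server may reject or misread.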
