The urlparse Module
The urlparse module contains functions to process URLs, and to convert between URLs and platform-specific filenames. Example 7-16 shows the urlparse function splitting a URL into its six components.
Example 7-16. Using the urlparse Module
File: urlparse-example-1.py
import urlparse
print urlparse.urlparse("http://host/path;params?query#fragment")
('http', 'host', '/path', 'params', 'query', 'fragment')
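The six fields in that tuple can also be read by name. The following is a minimal sketch, not part of the original example: it assumes an interpreter where urlparse returns a named tuple (true of later Python 2 releases and of Python 3, where the module lives at urllib.parse), and the try/except import is only there so the same lines run on either version.

```python
try:
    from urllib.parse import urlparse  # Python 3 location of the module
except ImportError:
    from urlparse import urlparse  # Python 2 location, as used in this chapter

result = urlparse("http://host/path;params?query#fragment")

# The result still unpacks as a plain 6-tuple, exactly as in Example 7-16.
scheme, netloc, path, params, query, fragment = result

# The same fields are also available as attributes.
print(result.scheme)
print(result.netloc)
print(result.path)
```

Attribute access avoids having to remember the position of each field, at the cost of depending on the named-tuple result type.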
A common use is to split an HTTP URL into host and path components (an HTTP request involves asking the host to return data identified by the path), as shown in Example 7-17.
Example 7-17. Using the urlparse Module to Parse HTTP Locators
File: urlparse-example-2.py
import urlparse
scheme, host, path, params, query, fragment =\
    urlparse.urlparse("http://host/path;params?query#fragment")
if scheme == "http":
    print "host", "=>", host
    if params:
        path = path + ";" + params
    if query:
        path = path + "?" + query
    print "path", "=>", path

host => host
path => /path;params?query
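The same host and path split can be wrapped in a small helper. This is only a sketch: the function name split_http_url is hypothetical, and returning None for non-HTTP URLs is an assumption made for illustration rather than anything the module prescribes.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2

def split_http_url(url):
    """Return (host, request_path) for an http URL, or None otherwise.

    Hypothetical helper; the name and the None convention are assumptions.
    """
    scheme, host, path, params, query, fragment = urlparse(url)
    if scheme != "http":
        return None
    # Reattach the parameters and query string, which belong to the request.
    if params:
        path = path + ";" + params
    if query:
        path = path + "?" + query
    # The fragment is deliberately left out of the request path.
    return host, path

print(split_http_url("http://host/path;params?query#fragment"))
```

The fragment is dropped because it identifies a location inside the returned data rather than the data itself.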
Alternatively, Example 7-18 shows how you can use the urlunparse function to put the URL back together again.
Example 7-18. Using the urlparse Module to Parse HTTP Locators
File: urlparse-example-3.py
import urlparse
scheme, host, path, params, query, fragment =\
    urlparse.urlparse("http://host/path;params?query#fragment")
if scheme == "http":
    print "host", "=>", host
    print "path", "=>", urlparse.urlunparse(
        (None, None, path, params, query, None)
        )

host => host
path => /path;params?query
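A variation on the same idea, offered as a sketch rather than a replacement for the example: here the unwanted components are passed as empty strings instead of None, on the assumption that this is the more portable choice across interpreter versions. The try/except import is likewise only a convenience so the lines run under both Python 2 and Python 3.

```python
try:
    from urllib.parse import urlparse, urlunparse  # Python 3
except ImportError:
    from urlparse import urlparse, urlunparse  # Python 2

url = "http://host/path;params?query#fragment"
scheme, host, path, params, query, fragment = urlparse(url)

# Rebuild only the request path: blank out scheme, host, and fragment.
request_path = urlunparse(("", "", path, params, query, ""))
print(request_path)

# Passing every component back unchanged reproduces the original URL.
print(urlunparse((scheme, host, path, params, query, fragment)))
```

Letting urlunparse insert the ";" and "?" separators avoids the manual string concatenation used in Example 7-17.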
Example 7-19 uses the urljoin function to combine an absolute URL with ...