Trails of activity

We can use the referrer information (famously misspelled as referer in the HTTP header) to trace how visitors move around a website. As with other interesting fields, we need to decompose it into host name and path information. The most reliable way to do this is with the urllib.parse module.

This means that we'll need to make a change to our log_event_2() function to add yet another parsing step. When we parse the referrer URL, we'll get at least six pieces of information:

  • scheme: This is usually http.
  • netloc: This is the server which made the referral. This will be the name of the server, not the IP address.
  • path: This is the path to the page which had the link.
  • params: This is anything after a ; symbol in the last element of the path. It is distinct from the query field, which follows the ? symbol. Usually, this is empty for simple ...
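
As a minimal sketch of this parsing step, the following example runs urllib.parse.urlparse() on a made-up referrer value and shows the fields it returns. The URL and the referrer_host_path() helper are assumptions for illustration only; they are not part of log_event_2(). The helper also assumes that the log records a missing referrer as "-", which is common in web server logs but should be checked against the actual data.

```python
from urllib.parse import urlparse

# Hypothetical referrer value, as it might appear in a log entry.
referrer = "http://www.example.com/books/index.html;v=2?page=3#reviews"

# urlparse() returns a named tuple with six fields.
parts = urlparse(referrer)

print(parts.scheme)    # http
print(parts.netloc)    # www.example.com
print(parts.path)      # /books/index.html
print(parts.params)    # v=2 (follows the ; in the last path element)
print(parts.query)     # page=3 (follows the ?)
print(parts.fragment)  # reviews (follows the #)


def referrer_host_path(referrer):
    """Return (host, path) for a referrer, or None if it has no host.

    Illustrative helper: a value such as "-" parses without error,
    but yields an empty netloc, so we treat it as "no referrer".
    """
    parts = urlparse(referrer)
    if not parts.netloc:
        return None
    return parts.netloc, parts.path


print(referrer_host_path(referrer))  # ('www.example.com', '/books/index.html')
print(referrer_host_path("-"))       # None
```

Keeping only the host and path is a design choice: those two values are enough to tell whether a referral came from another page on the same site or from somewhere else.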
