Spidering websites

Many tools can map out websites, but they often restrict you to a fixed style of output or a fixed location for the results. This boilerplate spidering script lets you map out a website in short order and alter the output as you please.

Getting ready

For this script to work, you'll need the BeautifulSoup library, which you can install through apt with apt-get install python-bs4 or, alternatively, through pip with pip install beautifulsoup4. It's as easy as that.

How to do it…

This is the script that we will be using:

import urllib2
from bs4 import BeautifulSoup
import sys
urls = []
urls2 = []
tarurl = sys.argv[1]
url = urllib2.urlopen(tarurl).read()
soup = BeautifulSoup(url)
for line in soup.find_all('a'):
    ...
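
The body of the loop is elided in the listing above, so the following is only a minimal sketch of the link-collection step, not the book's own code. It assumes Python 3, so urllib.request stands in for urllib2, and it uses the standard library's html.parser in place of BeautifulSoup so that it runs without extra packages. The names LinkParser, extract_links, and fetch are hypothetical helpers introduced for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag it sees (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return the unique absolute http(s) links found in html, in page order."""
    parser = LinkParser()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        # Resolve relative links against the page URL and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        # Skip mailto:, javascript:, and similar non-web schemes, plus repeats.
        if absolute.startswith(("http://", "https://")) and absolute not in links:
            links.append(absolute)
    return links


def fetch(url):
    """Download a page and decode it leniently (hypothetical helper)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")
```

To mirror the script above, you could read the target from sys.argv[1] into tarurl and print each entry of extract_links(fetch(tarurl), tarurl). Keeping the download separate from the parsing means the link extraction can be tried on a saved HTML string without touching the network.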
