This script shows how to extract links from a web page using urllib2 and HTMLParser. HTMLParser is a Python 2 standard library module for parsing text formatted as HTML; we subclass it and override handle_starttag to inspect each opening tag.
You can get more information at https://docs.python.org/2/library/htmlparser.html.
You can find the following code in the get_links_from_url.py file:
#!/usr/bin/python
import urllib2
from HTMLParser import HTMLParser

class myParser(HTMLParser):
    def handle_starttag(self, tag, attrs):
        if (tag == "a"):
            for a in attrs:
                if (a[0] == 'href'):
                    link = a[1]
                    if (link.find('http') >= 0):
                        print(link)
                        newParse = myParser()
                        newParse.feed(link)

web = raw_input("Enter url: ")
url = "http://"+web
request = urllib2.Request(url)
handle = urllib2.urlopen(request)
parser = myParser()
...
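The same technique can be tried without a network connection. The following is a minimal sketch, not the book's script: it assumes Python 3, where the HTMLParser module is available as html.parser, and it collects the links in a list instead of printing them. The names LinkCollector, extract_links, and sample are hypothetical, and the startswith check is a slightly stricter stand-in for the find('http') test used in the script.

```python
from html.parser import HTMLParser  # Python 3 location of the HTMLParser class


class LinkCollector(HTMLParser):
    """Collects href values of <a> tags that start with 'http'."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples for the opening tag
        if tag != "a":
            return
        for name, value in attrs:
            # value is None when the attribute has no value, e.g. <a href>
            if name == "href" and value and value.startswith("http"):
                self.links.append(value)


def extract_links(html_text):
    """Feed a string of HTML to the parser and return the collected links."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Hypothetical sample page: one absolute link, one relative link, one anchor without href
sample = (
    '<p><a href="http://example.com/a">A</a> '
    '<a href="/relative">B</a> <a name="x">C</a></p>'
)
print(extract_links(sample))
```

Storing the links on the parser instance keeps handle_starttag free of side effects, so the result can be filtered or passed to another function before anything is printed.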