This crawler is implemented as a Scrapy spider. The class definition begins by declaring the spider name and the start URL:

```python
import scrapy

class Spider(scrapy.Spider):
    name = 'spider'
    start_urls = ['https://blog.scrapinghub.com']
```
The parse method selects links matching the CSS selector 'div.prev-post > a' and follows them.
The spider also defines a close method, which Scrapy calls when the crawl is complete:

```python
def close(spider, reason):
    start_time = spider.crawler.stats.get_value('start_time')
    finish_time = spider.crawler.stats.get_value('finish_time')
    print("Total run time: ", finish_time - start_time)
```
This accesses the spider's crawler stats object, retrieves the start and finish times of the crawl, and reports the difference to the user.
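Scrapy's core stats record 'start_time' and 'finish_time' as datetime objects, so the subtraction above produces a timedelta. The arithmetic can be illustrated in isolation with made-up timestamps (the values below are examples, not real crawl stats):

```python
from datetime import datetime, timedelta

# Stand-ins for the values stats.get_value() would return.
start_time = datetime(2024, 1, 1, 12, 0, 0)
finish_time = datetime(2024, 1, 1, 12, 0, 42)

# Subtracting two datetimes yields a timedelta.
elapsed = finish_time - start_time
print("Total run time: ", elapsed)  # prints "Total run time:  0:00:42"
```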