Learning Scrapy by Dimitrios Kouzis-Loukas

Signals

Signals provide a mechanism to add callbacks to events that happen in the system, such as when a spider opens or when an item gets scraped. You can hook into them using the crawler.signals.connect() method (an example of using it can be found in the next section). There are just 11 of them, and maybe the easiest way to understand them is to see them in action. I created a project with an extension that hooks into every available signal. I also created one Item Pipeline, one downloader middleware, and one spider middleware, each of which logs every method invocation. The spider it uses is very simple. It just yields two items and then raises an exception:

def parse(self, response):
    for i in range(2):
        item = HooksasyncItem()
        item['name'] = "Hello ...
