Most of the time you have one spider per source website, but there are cases where you want to scrape data from many websites and the only thing that changes between them is the XPath expressions you use. In these cases, having a separate spider for every site feels like overkill. Can you crawl all of them with a single spider? The answer is yes.
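To make the idea concrete before we touch Scrapy, here is a minimal sketch of a single extraction routine driven by per-site XPath expressions. It is not the spider we build in this chapter: it uses only Python's standard `xml.etree.ElementTree`, which supports a limited XPath subset and needs well-formed markup. The site names, the `SITE_RULES` table, the `extract()` helper, and the sample pages are all hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-site rules: field name -> XPath expression.
# Site names and expressions are assumptions for illustration only.
SITE_RULES = {
    "site-a": {"name": ".//h1", "price": ".//span[@class='price']"},
    "site-b": {"name": ".//div[@id='title']", "price": ".//p[@class='cost']"},
}


def extract(site, markup):
    """Apply the XPath rules registered for `site` to well-formed markup."""
    root = ET.fromstring(markup)
    item = {}
    for field, xpath in SITE_RULES[site].items():
        node = root.find(xpath)
        # A missing node (or one without text) yields None instead of raising.
        item[field] = node.text.strip() if node is not None and node.text else None
    return item


# Two made-up pages with different layouts but the same kind of data.
PAGE_A = "<html><body><h1>Flat in London</h1><span class='price'>1200</span></body></html>"
PAGE_B = "<html><body><div id='title'>House in Leeds</div><p class='cost'>950</p></body></html>"

if __name__ == "__main__":
    print(extract("site-a", PAGE_A))
    print(extract("site-b", PAGE_B))
```

The code path is identical for both pages; only the entry looked up in `SITE_RULES` changes. That is the property we want from a generic spider: adding a site means adding a row of expressions, not writing another class.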
Let's create a new project for this experiment, as the items that we crawl are very different (actually, we won't define any in this project!). I assume that we were in the properties directory of ch05. Let's go one level up, as follows:
$ pwd
/root/book/ch05/properties
$ cd ..
$ pwd
/root/book/ch05
We can create a new project named generic and a spider named ...