How it works

This configuration tells Scrapy to retry a request when it fails with any of the status codes listed in RETRY_HTTP_CODES, up to RETRY_TIMES times per URL. Each retry goes through a proxy taken from the file specified by PROXY_LIST, selected according to the strategy defined by PROXY_MODE. With this in place, Scrapy can fall back to any number of proxy servers and retry the request from a different IP address and/or port.
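As a reminder of how the four settings fit together, here is a minimal sketch of a settings.py fragment. It assumes the proxy rotation comes from the third-party scrapy-proxies package (its RandomProxy middleware); the retry count, the status codes, and the proxy list path are illustrative values rather than recommendations, and the path in particular is a hypothetical placeholder.

```python
# settings.py (sketch; assumes the scrapy-proxies package is installed)

# Proxies fail often, so allow several retries per URL (illustrative value).
RETRY_TIMES = 10

# Status codes that trigger a retry (illustrative list).
RETRY_HTTP_CODES = [500, 503, 504, 400, 403, 404, 408]

# Lower numbers run earlier on the request path: retry handling,
# then proxy selection, then Scrapy's built-in proxy support.
DOWNLOADER_MIDDLEWARES = {
    "scrapy.downloadermiddlewares.retry.RetryMiddleware": 90,
    "scrapy_proxies.RandomProxy": 100,
    "scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware": 110,
}

# Text file with one proxy URL per line (hypothetical path).
PROXY_LIST = "/path/to/proxy/list.txt"

# Selection pattern as documented by scrapy-proxies:
# 0 = a different random proxy for every request,
# 1 = pick one proxy from the list and reuse it for all requests,
# 2 = use a single custom proxy given in a separate setting.
PROXY_MODE = 0
```

With PROXY_MODE set to 0, each retry is likely to leave from a different address, which is usually what you want when individual proxies are unreliable.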
