How it works

The document was stored with the following line:

es.index(index='joblistings', doc_type='job-listing', id=job_listing_id, body=listing)

Let's examine what each of these parameters does when storing this document.

The index parameter specifies the Elasticsearch index in which we want to store the document, here named joblistings. This name also becomes the first portion of the URL used to retrieve the documents.

Each Elasticsearch index can also hold multiple document 'types', which are logical groupings that separate different kinds of documents within the index. We used 'job-listing', and that value also forms the second part of the URL for retrieving a specific document.
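To make the URL structure concrete, here is a minimal sketch of how the index name and document type combine into a retrieval URL. It assumes the document ID forms the final path segment, which is how Elasticsearch versions that support document types address a single document. The helper document_url, the base URL http://localhost:9200, and the ID 122517 are hypothetical values chosen for illustration; substitute the endpoint and ID of your own cluster and listing.

```python
from urllib.parse import quote


def document_url(index, doc_type, doc_id, base_url="http://localhost:9200"):
    """Build a retrieval URL of the form <base_url>/<index>/<doc_type>/<doc_id>.

    base_url is an assumed local endpoint, not a value taken from the recipe.
    """
    # Percent-encode each path segment so unusual characters stay URL-safe.
    parts = [quote(str(part), safe="") for part in (index, doc_type, doc_id)]
    return base_url.rstrip("/") + "/" + "/".join(parts)


# "122517" is a hypothetical job listing ID used only for illustration.
url = document_url("joblistings", "job-listing", "122517")
print(url)  # http://localhost:9200/joblistings/job-listing/122517
```

The first path segment matches the index argument passed to es.index and the second matches doc_type, so the same values used to store a document are the ones needed to fetch it back.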

Elasticsearch ...
