Detecting the document language during indexation

Imagine that you have users from different countries and want to let them see only the indexed content written in their native language. There is one problem: the documents don't have their language identified, so we need to identify it ourselves. Let's see how to detect the language of each document at indexing time and store that information in the index, along with the document, for later use.
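Before turning to Solr itself, the general idea can be sketched in a few lines of Python: guess the language of a text field and write the result into an extra field before the document is sent to the index. This is only a toy illustration under stated assumptions. The stopword-overlap heuristic is a stand-in, not the technique used by the Solr contribution module, and the field names `description` and `lang` are hypothetical, not the index structure of this recipe.

```python
# Toy illustration only: a stopword-overlap language guesser.
# This is NOT how Solr's language identification module works.

# Tiny, hand-picked stopword lists (an assumption made for this sketch).
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in", "that", "it"},
    "de": {"der", "die", "und", "ist", "das", "nicht", "ein", "zu"},
    "fr": {"le", "la", "et", "est", "les", "des", "une", "dans"},
}


def detect_language(text, fallback="unknown"):
    """Return the language code whose stopwords occur most often in text."""
    tokens = text.lower().split()
    scores = {
        lang: sum(1 for token in tokens if token in words)
        for lang, words in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    # With no stopword hit at all, refuse to guess.
    return best if scores[best] > 0 else fallback


def enrich(doc, text_field="description", lang_field="lang"):
    """Return a copy of doc with the detected language stored in lang_field.

    Both field names are hypothetical defaults chosen for this sketch.
    """
    enriched = dict(doc)
    enriched[lang_field] = detect_language(doc.get(text_field, ""))
    return enriched


docs = [
    {"id": "1", "description": "The cat is in the house"},
    {"id": "2", "description": "Der Hund ist nicht in dem Haus"},
    {"id": "3", "description": "Le chat est dans la maison"},
]

# Each document now carries its language and could be sent to the indexer.
indexed = [enrich(doc) for doc in docs]
```

Once the language is stored as a regular field, restricting results to a user's language becomes an ordinary filter on that field at query time.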

How to do it...

For language identification, we will use one of the Solr contribution modules, but let's start from the beginning:

  1. For the purpose of the recipe, I assume that we will use the following index structure (we just need to add the following ...
