by Rafał Kuć

Detecting the document's language

Imagine that you have users from different countries and you would like to let them see only the indexed content that is written in their native language. Sounds quite interesting, right? Let us see how we can identify the language of each document during indexing and store that information along with the document in the index for later use.

How to do it...

For language identification, we will use one of the Solr contrib modules, but let's start from the beginning.

  1. For the purpose of this recipe, let's assume the following index structure (add it to the fields section of your schema.xml file):
    <field name="id" type="string" indexed="true" stored="true" required="true" ...
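
As a rough illustration of how a Solr contrib module can detect the language at indexing time, the following sketch defines an update request processor chain in solrconfig.xml that uses the language identification contrib (langid). It is not taken from the recipe itself. The chain name and the field names name, description, and langId are hypothetical and would have to match fields defined in your own schema.xml; the sketch also assumes that the langid contrib JAR files are already on Solr's classpath.

```xml
<!-- Sketch only: the chain name and field names below are assumptions, not the recipe's values -->
<updateRequestProcessorChain name="langid">
  <processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
    <!-- Fields whose content is analyzed to detect the language (hypothetical names) -->
    <str name="langid.fl">name,description</str>
    <!-- Field in which the detected language code is stored (hypothetical name) -->
    <str name="langid.langField">langId</str>
    <!-- Language code to use when detection does not produce a result -->
    <str name="langid.fallback">en</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

With a chain like this selected for update requests, each indexed document would carry its detected language code in the langId field, which could later be used in a filter query to show users only the content written in their language.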
