Stemming different languages
Stemming is a very common requirement; it is the process of reducing words to their root form (or stem). Imagine a book e-commerce store where you keep the books' names and descriptions. You want to be able to find words such as shown and showed when you type the word show, and vice versa. You can meet this requirement with stemming algorithms. The catch is that there are no general-purpose stemmers; each one is language-specific. This recipe will show you how to add stemming to your data analysis chain and where to look for a list of stemmers.
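To make the idea concrete, here is a minimal sketch of suffix stripping in Python. It is not the algorithm Solr's stemmers use; the suffix list and the function names `toy_stem` and `matches` are hypothetical and chosen only for illustration. It shows why a query word and an indexed word can match once both are reduced to the same stem, and why such rules are tied to one language.

```python
# Toy suffix-stripping stemmer for English, for illustration only.
# The suffix list is an assumption; real stemmers use far more elaborate rules.
SUFFIXES = ("ing", "ed", "n", "s")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    lowered = word.lower()
    for suffix in SUFFIXES:
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= 3:
            return lowered[: -len(suffix)]
    return lowered


def matches(query: str, indexed: str) -> bool:
    """Two words match when they reduce to the same stem."""
    return toy_stem(query) == toy_stem(indexed)


if __name__ == "__main__":
    for word in ("show", "shown", "showed", "showing"):
        print(word, "->", toy_stem(word))
```

Because the same reduction is applied to the indexed text and to the query, the match works in both directions. The hard-coded English suffixes also show why a stemmer written for one language is of little use for another.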
How to do it...
To stem English text, we need to take the following steps:
- We will start with the index structure. Let's assume that our index consists of three fields ...