Using patterns to replace tokens

Let's assume that we want to search inside user blog posts. We need to prepare a simple search returning only the identifier of the documents that were matched. However, we will want to remove some words because of explicit language. Of course, we can do this using the stop words functionality, but what if we want to know how many documents have their contents censored with compute statistics on. In such a case, we can't use the stop words functionality, we need something more, which means that we need regular expressions. This recipe will show you how to achieve such requirements using Solr and one of its filters.

How to do it...

To achieve our needs, we will use the solr.PatternReplaceFilterFactory filter. Let's ...

Get Solr Cookbook - Third Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.