ICU analysis plugin

Elasticsearch has an ICU analysis plugin. You can use this plugin to apply the normalization forms mentioned in the previous section, ensuring that all of your tokens are in the same form. Note that the plugin version must be compatible with the version of Elasticsearch installed on your machine:

bin/plugin install elasticsearch/elasticsearch-analysis-icu/2.7.0

After installation, the plugin registers its normalization filter by default under the name icu_normalizer or icuNormalizer. The following example shows how to use it:

curl -XPUT 'localhost:9200/my_index' -d '{
  "settings": {
    "analysis": {
      "filter": {
        "nfkc_normalizer": {
          "type": "icu_normalizer",
          "name": "nfkc"
        }
      },
      "analyzer": {
        "my_normalizer": {
          "tokenizer": "icu_tokenizer",
          "filter":  [ "nfkc_normalizer" ]
        }
      }
    }
  }
}'

The preceding configuration ...
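
To get a feel for what the nfkc form does to a token, the following minimal sketch uses Python's standard unicodedata module rather than the ICU plugin itself. The helper name nfkc and the sample tokens are assumptions made for illustration, and the plugin's output on a real index may differ in details such as the Unicode version it supports:

```python
import unicodedata


def nfkc(token: str) -> str:
    """Return the NFKC form of a token.

    NFKC applies compatibility decomposition followed by canonical
    composition. Hypothetical helper for illustration; this is not
    part of the ICU analysis plugin.
    """
    return unicodedata.normalize("NFKC", token)


# Sample tokens (assumed for illustration):
#   "\ufb01nd"      starts with the single-character "fi" ligature
#   "\uff25\uff33"  is "ES" written with full-width letters
#   "e\u0301"       is "e" followed by a combining acute accent
samples = ["\ufb01nd", "\uff25\uff33", "e\u0301"]

for token in samples:
    print(repr(token), "->", repr(nfkc(token)))
```

In this sketch the ligature becomes the two letters "fi", the full-width letters become plain ASCII, and the base letter plus combining accent is composed into a single character, so differently encoded variants of the same word end up as the same token.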
