Implementing the divergence from randomness model

In Lucene, the divergence from randomness (DFR) model is implemented as DFRSimilarity. It is made up of three components: BasicModel, AfterEffect, and Normalization. BasicModel is a model of information content, AfterEffect is the first normalization, and Normalization is the second (length) normalization. Here is an excerpt from Lucene's Javadoc on DFRSimilarity's components:

BasicModel: This is a basic model of information content:
- BasicModelBE: This is the limiting form of Bose-Einstein
- BasicModelG: This is the geometric approximation of Bose-Einstein
- BasicModelP: This is the Poisson approximation of the Binomial
- BasicModelD: This is the divergence approximation of the Binomial
- BasicModelIn: This is the inverse ...
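
To make the three-part structure concrete, here is a minimal, standalone sketch of how a DFR-style score can be composed: the length normalization adjusts the term frequency, and the basic model and after effect are then multiplied together. It does not use Lucene's classes. The class name DfrSketch, the method names, and the sample statistics are hypothetical, and the choice of the geometric basic model, a Laplace after effect, and an H2-style length normalization is an assumption made for illustration. Lucene's own implementations may smooth or scale these quantities differently.

```java
/** Illustrative only: mirrors the three-part shape of a DFR score, not Lucene's code. */
final class DfrSketch {

    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    /** Second (length) normalization, H2 style: tfn = tf * log2(1 + c * avgLen / docLen). */
    static double normalizeTf(double tf, double docLen, double avgLen, double c) {
        return tf * log2(1.0 + c * avgLen / docLen);
    }

    /** Basic model (geometric): information content of seeing the term tfn times. */
    static double basicModelG(double tfn, double totalTermFreq, double numDocs) {
        double lambda = totalTermFreq / numDocs; // mean occurrences of the term per document
        return log2(1.0 + lambda) + tfn * log2((1.0 + lambda) / lambda);
    }

    /** After effect (Laplace): first normalization, discounts the gain as tfn grows. */
    static double afterEffectL(double tfn) {
        return 1.0 / (tfn + 1.0);
    }

    /** Composition: normalize tf first, then multiply basic model by after effect. */
    static double score(double tf, double docLen, double avgLen,
                        double totalTermFreq, double numDocs) {
        double tfn = normalizeTf(tf, docLen, avgLen, 1.0); // c = 1.0 is an assumed default
        return basicModelG(tfn, totalTermFreq, numDocs) * afterEffectL(tfn);
    }

    public static void main(String[] args) {
        // Hypothetical statistics: 1000 documents, average field length 100.
        double rare = score(3, 100, 100, 10, 1000);
        double common = score(3, 100, 100, 500, 1000);
        double longDoc = score(3, 400, 100, 10, 1000);
        System.out.println("rare term, average-length doc: " + rare);
        System.out.println("common term, average-length doc: " + common);
        System.out.println("rare term, long doc: " + longDoc);
    }
}
```

With these sample numbers, a term that is rare in the collection scores higher than a common one, and the same term frequency in a longer document scores lower. In Lucene itself, a comparable combination would be constructed as new DFRSimilarity(new BasicModelG(), new AfterEffectL(), new NormalizationH2()) and passed to setSimilarity on both the IndexWriterConfig and the IndexSearcher.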

Lucene 4 Cookbook by Edwood Ng, Vineeth Mohan
