Implementing the divergence from randomness model

In Lucene, the divergence from randomness (DFR) model is implemented as DFRSimilarity. It is made up of three components: BasicModel, AfterEffect, and Normalization. BasicModel is a model of information content, AfterEffect is the first normalization, and Normalization is the second (length) normalization. Here is an excerpt from Lucene's Javadoc on DFRSimilarity's components:

  1. BasicModel: This is a basic model of information content:
    • BasicModelBE: This is the limiting form of Bose-Einstein
    • BasicModelG: This is the geometric approximation of Bose-Einstein
    • BasicModelP: This is the Poisson approximation of the Binomial
    • BasicModelD: This is the divergence approximation of the Binomial
    • BasicModelIn: This is the inverse ...
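
To make the division of labor among the three components concrete, here is a minimal, Lucene-free sketch of how a DFR score can be assembled from them. It assumes the textbook forms of the geometric basic model (G), the Laplace after effect (L), and the H2 length normalization. The class name DfrSketch, its method names, and its parameters are hypothetical and chosen for illustration only; Lucene's own classes may differ in detail from these formulas.

```java
public class DfrSketch {

    // Base-2 logarithm, used throughout the DFR formulas.
    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // Second (length) normalization, assumed H2 form:
    // tfn = tf * log2(1 + c * avgLen / docLen)
    static double tfnH2(double tf, double docLen, double avgLen, double c) {
        return tf * log2(1.0 + c * avgLen / docLen);
    }

    // Basic model of information content, assumed geometric (G) form:
    // -log2(1 / (1 + lambda)) - tfn * log2(lambda / (1 + lambda)),
    // where lambda = totalTermFreq / numDocs.
    static double basicModelG(double tfn, double totalTermFreq, double numDocs) {
        double lambda = totalTermFreq / numDocs;
        return log2(1.0 + lambda) + tfn * log2((1.0 + lambda) / lambda);
    }

    // First normalization (after effect), assumed Laplace (L) form: 1 / (tfn + 1)
    static double afterEffectL(double tfn) {
        return 1.0 / (tfn + 1.0);
    }

    // Combine the three components: normalize the term frequency first,
    // then multiply the information content by the after effect.
    static double score(double tf, double docLen, double avgLen, double c,
                        double totalTermFreq, double numDocs) {
        double tfn = tfnH2(tf, docLen, avgLen, c);
        return basicModelG(tfn, totalTermFreq, numDocs) * afterEffectL(tfn);
    }

    public static void main(String[] args) {
        // Hypothetical statistics: the term occurs 100 times in a 1,000-document index.
        double shortDoc = score(2.0, 100.0, 100.0, 1.0, 100.0, 1000.0);
        double longDoc = score(2.0, 200.0, 100.0, 1.0, 100.0, 1000.0);
        System.out.println("average-length document: " + shortDoc);
        System.out.println("double-length document:  " + longDoc);
    }
}
```

With the same raw term frequency, the longer document receives a smaller normalized frequency and therefore a lower score in this sketch. In Lucene itself, the equivalent choice is made by passing one BasicModel, one AfterEffect, and one Normalization instance to the DFRSimilarity constructor.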
