Computing the Jaro-Winkler distance between two strings

The Jaro-Winkler distance measures the similarity of two strings as a real number between 0 and 1. A value of 0 means the strings have nothing in common, and a value of 1 means they match exactly.

Getting ready

The algorithm behind the function comes from the following formula, presented in the Wikipedia article on the Jaro-Winkler distance (http://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance):

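d_j = \begin{cases} 0 & \text{if } m = 0 \\ \frac{1}{3}\left(\frac{m}{|s_1|} + \frac{m}{|s_2|} + \frac{m - t}{m}\right) & \text{otherwise} \end{cases}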

In the preceding formula, the following are the representations of the variables used:

  • s1 is the first string.
  • s2 is the second string.
  • m is the number of identical characters within a distance of at ...
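To make the formula concrete, here is a minimal sketch in Haskell. It is not the recipe's own code. It assumes the standard definitions from the Wikipedia article: two characters match when they are equal and no more than floor(max(|s1|, |s2|) / 2) - 1 positions apart, t is half the number of matched characters that appear in a different order, and the Winkler adjustment uses a common prefix of at most four characters with a scaling factor of 0.1. The names matches, jaro, and jaroWinkler are hypothetical and chosen only for this illustration.

```haskell
import Data.List (sortOn)

-- Pair up matching characters as (index in s1, index in s2).
-- Assumption: the match window is floor(max(|s1|, |s2|) / 2) - 1,
-- and each character of s2 may be matched at most once.
matches :: String -> String -> [(Int, Int)]
matches s1 s2 = reverse (foldl step [] (zip [0 ..] s1))
  where
    window = max 0 (max (length s1) (length s2) `div` 2 - 1)
    step acc (i, c) =
      case [ j | (j, d) <- zip [0 ..] s2
               , d == c
               , abs (i - j) <= window
               , j `notElem` map snd acc ] of
        (j : _) -> (i, j) : acc   -- take the first free match in s2
        []      -> acc            -- no match for this character

-- Jaro similarity: 0 when nothing matches, otherwise the mean of
-- m/|s1|, m/|s2| and (m - t)/m.
jaro :: String -> String -> Double
jaro s1 s2
  | m == 0    = 0
  | otherwise = (m / len s1 + m / len s2 + (m - t) / m) / 3
  where
    ms = matches s1 s2
    m  = fromIntegral (length ms) :: Double
    xs = [s1 !! i | (i, _) <- ms]             -- matched chars in s1 order
    ys = [s2 !! j | (_, j) <- sortOn snd ms]  -- matched chars in s2 order
    -- Assumption: t is half the count of out-of-order matches (not rounded).
    t  = fromIntegral (length (filter id (zipWith (/=) xs ys))) / 2
    len str = fromIntegral (length str)

-- Winkler adjustment: boost the score for a shared prefix.
-- Assumption: prefix capped at 4 characters, scaling factor p = 0.1.
jaroWinkler :: String -> String -> Double
jaroWinkler s1 s2 = dj + l * p * (1 - dj)
  where
    dj = jaro s1 s2
    l  = fromIntegral (length (takeWhile id (take 4 (zipWith (==) s1 s2))))
    p  = 0.1
```

Under these assumptions, jaro "MARTHA" "MARHTA" evaluates to roughly 0.944, and jaroWinkler raises it to roughly 0.961 because the two strings share the three-character prefix "MAR". The list-based lookups keep the sketch short but are quadratic in the string length, so an array-based version would suit long inputs better.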
