There's more...

Several other text distance metrics are worth knowing. The following table defines them for two strings, s1 and s2:

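| Name | Description | Formula |
| --- | --- | --- |
| Hamming distance | Number of positions at which the characters differ. Only valid if the strings are of equal length. | $D(s_1, s_2) = \sum_i I_i$, where $I_i$ is an indicator function that equals 1 when the characters at position $i$ differ and 0 otherwise. |
| Cosine distance | One minus the dot product of the k-gram count vectors divided by the product of their L2 norms. | $D(s_1, s_2) = 1 - \frac{k(s_1) \cdot k(s_2)}{\lVert k(s_1) \rVert_2 \, \lVert k(s_2) \rVert_2}$, where $k(s)$ is the k-gram count vector of $s$. |
| Jaccard distance | Number of characters in common, divided by ... | |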
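To make the first two metrics concrete, here is a minimal plain-Python sketch rather than the book's TensorFlow code. It assumes the standard definitions: Hamming distance counts the positions at which two equal-length strings differ, and cosine distance is one minus the cosine similarity of the strings' k-gram count vectors. The function names, the helper `kgram_counts`, and the default `k=2` are illustrative choices, not part of the original recipe.

```python
from collections import Counter
import math


def hamming_distance(s1, s2):
    """Count the positions at which two equal-length strings differ."""
    if len(s1) != len(s2):
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))


def kgram_counts(s, k=2):
    """Count every overlapping substring of length k in s (hypothetical helper)."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))


def cosine_distance(s1, s2, k=2):
    """One minus the cosine similarity of the two k-gram count vectors."""
    a, b = kgram_counts(s1, k), kgram_counts(s2, k)
    # Counter returns 0 for k-grams that are missing from b.
    dot = sum(a[g] * b[g] for g in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("each string must contain at least one k-gram")
    return 1.0 - dot / (norm_a * norm_b)


print(hamming_distance("karolin", "kathrin"))  # 3
print(cosine_distance("night", "nacht"))       # 0.75
```

With bigrams, "night" and "nacht" share only "ht" out of four bigrams each, so the cosine similarity is 1/4 and the distance is 0.75. A different choice of k would change the result.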
