How it works...

One of the harder parts of an address matching problem like this one is choosing the weights and deciding how to scale the distances. Getting these right may take some exploration of, and insight into, the data itself. When dealing with addresses, we should also consider components other than those used here: the street number can be treated as a component separate from the street address, and further components, such as the city and state, can be added.
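As a rough illustration of weighting and scaling, the following minimal sketch maps each component's distance to a similarity between 0 and 1 and then combines the similarities with a weighted average. The helper names (`scaled_similarity`, `combined_score`), the maximum distances, and the equal weights are assumptions chosen for the example, not values prescribed by the recipe; in practice they would be tuned against the data.

```python
def scaled_similarity(distance, max_distance):
    """Map a distance in [0, max_distance] to a similarity in [0, 1]."""
    if max_distance <= 0:
        return 1.0
    # Clip so that out-of-range distances cannot produce values outside [0, 1].
    clipped = min(max(distance, 0.0), max_distance)
    return 1.0 - clipped / max_distance


def combined_score(similarities, weights):
    """Weighted average of per-component similarities.

    The weights are normalized by their sum, so they need not add up to 1.
    """
    total = sum(weights[name] for name in similarities)
    return sum(weights[name] * similarities[name] for name in similarities) / total


# Hypothetical inputs: a street edit distance of 2 against a longest string
# of 16 characters, and a ZIP Code difference of 10 within a range of 100.
similarities = {
    "street": scaled_similarity(2, 16),
    "zip": scaled_similarity(10, 100),
}
weights = {"street": 0.5, "zip": 0.5}  # assumed equal weighting

score = combined_score(similarities, weights)
print(round(score, 4))
```

Adding a component such as the street number, city, or state would then amount to adding one more entry to both dictionaries, which makes it easy to experiment with different weightings.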

Numerical address components can be treated either as numbers (with a numerical distance) or as characters (with an edit distance); the choice is up to you. Note that we could consider using an edit distance with the ZIP Code, if we think that typos ...
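To make the difference between the two treatments concrete, here is a small sketch in plain Python that compares a numerical distance with a Levenshtein edit distance on ZIP Codes. The function names and the sample ZIP Codes are made up for illustration, and the edit distance is a straightforward dynamic-programming implementation rather than the TensorFlow operation used in the recipe.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def numeric_distance(a, b):
    """Absolute difference between two ZIP Codes read as integers."""
    return abs(int(a) - int(b))


reference = "65247"          # hypothetical reference ZIP Code
last_digit_off = "65248"     # differs in the last digit
first_digit_off = "15247"    # differs in the first digit

for candidate in (last_digit_off, first_digit_off):
    print(candidate,
          numeric_distance(candidate, reference),
          edit_distance(candidate, reference))
```

Both candidates differ from the reference by a single character, so their edit distance is 1 in each case, but the numerical distance is 1 for the first and 50000 for the second. Which behavior is preferable depends on what kind of discrepancy the data is expected to contain.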

