Finding Similarities in Data

At the time of writing, 73,445 mails had been posted to the Erlang mailing list.[42] This represents a vast store of knowledge that can be reused in many ways. We can use it to answer questions about Erlang or as a source of inspiration.

Suppose you’re writing a program and need help. This is where Sherlock comes in. Sherlock analyzes your program and then suggests the posting, from all previous mails to the Erlang list, that is most similar to it and so might be able to help you.

So, here’s the master plan. Step 1 is to download the entire Erlang mailing list and store it locally. Step 2 is to organize and parse all the mails. Step 3 is to compute some property of the mails that can be used for similarity searches. Step 4 is to query the mails ...
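The plan does not yet say which property step 3 computes. As a rough sketch only, assume each mail is reduced to a map of word counts and that two texts are compared by the cosine of the angle between their count vectors. That choice is an assumption made here for illustration and is not necessarily what Sherlock does. The module name similarity_sketch, its function names, and the {Id, Text} shape of a mail are all hypothetical.

```erlang
-module(similarity_sketch).
-export([words/1, freq/1, cosine/2, most_similar/2]).

%% Crude, hypothetical tokenizer: lowercase the text and split it on
%% spaces, tabs, newlines, and common punctuation.
words(Text) ->
    string:lexemes(string:lowercase(Text), " \t\n.,;:!?()").

%% Count how often each word occurs, giving #{Word => Count}.
freq(Words) ->
    lists:foldl(
      fun(W, M) -> maps:update_with(W, fun(N) -> N + 1 end, 1, M) end,
      #{}, Words).

%% Cosine similarity of two word-count maps: 0.0 means no shared words,
%% values near 1.0 mean nearly identical word distributions.
cosine(A, B) ->
    Dot = maps:fold(fun(W, N, Acc) -> Acc + N * maps:get(W, B, 0) end, 0, A),
    Norm = fun(M) -> math:sqrt(lists:sum([N * N || N <- maps:values(M)])) end,
    case Norm(A) * Norm(B) of
        Z when Z == 0 -> 0.0;   % at least one text had no words
        Z -> Dot / Z
    end.

%% Step 4 in miniature: score every {Id, Text} mail against the program
%% text and return the best {Score, Id}. Mails must be a non-empty list.
most_similar(Program, Mails) ->
    Q = freq(words(Program)),
    lists:max([{cosine(Q, freq(words(Text))), Id} || {Id, Text} <- Mails]).
```

For example, similarity_sketch:most_similar("spawn a new process here", [{1, "how do I spawn a process"}, {2, "binary pattern matching question"}]) picks mail 1, because it is the only one that shares words with the query. Raw word counts give common words as much weight as rare ones, so a real search over tens of thousands of mails would likely need a better weighting.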
