Using Directed Acyclic Word Graphs

We use Directed Acyclic Word Graphs (DAWGs) to look up strings in a large corpus very quickly while using very little space. A DAWG saves space by sharing the prefixes and suffixes that words have in common. Imagine compressing every word in a dictionary into a DAWG and still being able to look up words efficiently. It is a powerful data structure that comes in handy when dealing with a large corpus of words. A very nice introduction to DAWGs can be found in Steve Hanov's blog post at http://stevehanov.ca/blog/index.php?id=115.
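To get a feel for where the space savings come from, here is a minimal sketch that does not use the dawg package at all. It builds a plain trie with Data.Map from the containers package and then counts how many nodes would remain if structurally identical subtrees were merged, which is roughly what a DAWG does. The names Trie, fromWords, member, trieNodes, and dawgNodes are our own illustrative choices, not part of any library.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- A trie node: does a word end here, plus the child edges by character.
-- (Illustrative type, not the representation used by the dawg package.)
data Trie = Trie Bool (M.Map Char Trie)
  deriving (Eq, Ord)

emptyTrie :: Trie
emptyTrie = Trie False M.empty

-- Insert one word, creating child nodes along the way as needed.
insertWord :: String -> Trie -> Trie
insertWord []     (Trie _ kids)   = Trie True kids
insertWord (c:cs) (Trie end kids) =
  Trie end (M.insert c (insertWord cs child) kids)
  where child = M.findWithDefault emptyTrie c kids

fromWords :: [String] -> Trie
fromWords = foldr insertWord emptyTrie

-- Follow the edges for each character; succeed only on a word-ending node.
member :: String -> Trie -> Bool
member []     (Trie end _)  = end
member (c:cs) (Trie _ kids) = maybe False (member cs) (M.lookup c kids)

-- Number of nodes in the plain trie (every node counted separately).
trieNodes :: Trie -> Int
trieNodes (Trie _ kids) = 1 + sum (map trieNodes (M.elems kids))

-- Number of nodes left once identical subtrees are shared:
-- this is simply the number of distinct subtrees.
dawgNodes :: Trie -> Int
dawgNodes = S.size . collect
  where
    collect t@(Trie _ kids) =
      S.insert t (S.unions (map collect (M.elems kids)))
```

For the words tap, taps, top, and tops, trieNodes reports 8 nodes while dawgNodes reports 5, because the subtrees reached through "ta" and "to" are identical and can be shared. The sketch only counts the merged nodes; a real DAWG implementation builds the shared graph directly.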

We can use this recipe to incorporate a DAWG in our code.

Getting ready

Install the DAWG package using cabal:

$ cabal install dawg

How to do it...

We name a new file Main.hs and insert the following code:

  1. Import the following modules:
    import qualified Data.DAWG.Static ...
