Traversing online directories for data

A directory search typically provides names and contact information per query. By brute forcing many of these search queries, we can obtain all data stored in the directory listing database. This recipe runs thousands of search queries to obtain as much data as possible from a directory search. This recipe is provided only as a learning tool to see the power and simplicity of data gathering in Haskell.

Getting ready

Make sure to have a strong Internet connection.

Install the hxt and HandsomeSoup packages using Cabal:

$ cabal install hxt
$ cabal install HandsomeSoup

How to do it...

  1. Set up the following dependencies:
    import Network.HTTP
    import Network.URI
    import Text.XML.HXT.Core
    import Text.HandsomeSoup
  2. Define a ...

Get Haskell Data Analysis Cookbook now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.