Splitting a string on lines, words, or arbitrary tokens

Useful data is often interspersed between delimiters such as commas or spaces, so splitting strings is a routine step in most data analysis tasks.

Getting ready

Create an input.txt file similar to the following one:

$ cat input.txt

first line
second line
words are split by space
comma,separated,values
or any delimiter you want
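
If you would rather create the file from Haskell than by hand, the following is a minimal sketch. The name sampleText is hypothetical and not part of the recipe. The program overwrites any existing input.txt in the current directory.

```haskell
-- Sketch only: writes the sample text shown above to input.txt.
-- sampleText is a hypothetical name introduced for this illustration.
sampleText :: String
sampleText = unlines
  [ "first line"
  , "second line"
  , "words are split by space"
  , "comma,separated,values"
  , "or any delimiter you want"
  ]

main :: IO ()
main = writeFile "input.txt" sampleText  -- overwrites an existing file
```

Note that unlines ends every line, including the last one, with a newline character.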

Install the split package using Cabal as follows:

$ cabal install split

How to do it...

  1. The only function we will need is splitOn, which is imported as follows:
    import Data.List.Split (splitOn)
  2. First, we split the string into lines, as shown in the following code snippet:
    main :: IO ()
    main = do
      input <- readFile "input.txt"
      let ls = lines input
      print ls
  3. The lines are printed in a list as follows: ...
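
The three kinds of splitting named in the title can be sketched together as follows. This is an illustration under stated assumptions, not necessarily the remainder of the recipe. It assumes the split package from the Getting ready section is installed, it inlines the sample text under the hypothetical name sample so that it does not depend on input.txt, and the choice of "an" as an arbitrary delimiter is ours.

```haskell
-- Sketch only: assumes the split package is installed (cabal install split).
import Data.List.Split (splitOn)

-- Hypothetical inlined copy of the sample input, so no file is needed.
sample :: String
sample = unlines
  [ "first line"
  , "second line"
  , "words are split by space"
  , "comma,separated,values"
  , "or any delimiter you want"
  ]

main :: IO ()
main = do
  let ls = lines sample           -- split on newline characters
  print ls
  print (words (ls !! 2))         -- split on runs of whitespace
  -- ["words","are","split","by","space"]
  print (splitOn "," (ls !! 3))   -- split on a single-character token
  -- ["comma","separated","values"]
  print (splitOn "an" (ls !! 4))  -- split on a multi-character token
  -- ["or ","y delimiter you w","t"]
```

lines and words come from the Prelude, so only splitOn requires the extra package. Unlike words, splitOn does not collapse repeated delimiters: two adjacent delimiters produce an empty string between them.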
