Chapter 17

Data Mining

Analysis of Structured and Unstructured Information

Dyan Decker, Alexandre Blanc, John Loveland, and Mona Clayton

Companies create, store, and manipulate large volumes of electronic data every day. According to a study by International Data Corporation (IDC), a market research firm, around 1,200 exabytes of digital data (one exabyte is a billion gigabytes) will be generated this year, up from an estimated 150 exabytes in 2005.

When mined appropriately, data provide rich information that can be invaluable to a forensic investigator. This chapter discusses the importance of data mining to an investigation, highlights the differences between structured and unstructured data, and presents some leading practices for using data mining successfully in an investigation.

Consider some of the ways that data originate:

  • The business day dawns, and each swipe of a parking pass and building access card creates a record in a security database.
  • A manager e-mails his colleagues information regarding changes to the standard price list for the company's products; a server (possibly several) stores the e-mail in the manager's and recipients' mailboxes.
  • A sales representative inputs activity from each of the day's sales calls, generating a record in the company's customer relationship management (CRM) system.
  • An accounting clerk inputs a batch of vendor invoices, creating a series of records in the accounts payable module of the company's enterprise resource planning (ERP) system.
  • A member of ...
