Parse that data! Practical tips for preparing your raw data for analysis

P. Guo, University of Rochester, Rochester, NY, United States

Abstract

Data analysis is a central task in the workflow of data scientists, researchers, software engineers, business analysts, and just about every other professional who works with data. The first, and often mundane, step in data analysis is preparing raw data, which can originate from a wide variety of sources, such as:

 logs from a web server,

 outputs from a scientific instrument,

 exported data from an online survey,

 a data dump from a 1970s government database, or

 reports prepared by external consultants.

Keywords

Raw data; Data munging; Data wrangling; Data parsers; Assertions; Set data; Counter data; ...

