PREFACE

Newspapers and blogs are now filled with discussions about “big data,” massive amounts of largely unstructured data generated by behavior that is electronically recorded. “Big data” was the central theme at the 2012 meeting of the World Economic Forum, and the U.S. Government issued a Big Data Research and Development Initiative the same year. The American Statistical Association has also made the topic a theme for the 2012 and 2013 Joint Statistical Meetings.

Paradata are a key feature of the “big data” revolution for survey researchers and survey methodologists. The survey world is peppered with process data, such as electronic records of contact attempts and automatically captured mouse movements that respondents produce when answering web surveys. While not all of these data sets are massive in the usual sense of “big data,” they are often highly unstructured, and it is not always clear to those collecting the data which pieces are relevant and how they should be analyzed. In many instances it is not even obvious which data are generated.

Recently Alex Yoder, the CEO of the company Webtrends, pointed out that just as “Gold requires mining and processing before it finds its way into our jewelry, electronics, and even the Fort Knox vault […] data requires collection, mining and, finally, analysis before we can realize its true value for businesses, governments, and individuals alike.”1 The same can be said for paradata. Paradata are data generated in the process of ...

