Analyzing the Analyzers

Book description

There has been intense excitement in recent years around activities labeled "data science," "big data," and "analytics." However, the lack of clarity around these terms and, particularly, around the skill sets and capabilities of their practitioners has led to inefficient communication between "data scientists" and the organizations requiring their services. This lack of clarity has frequently led to missed opportunities. To address this issue, we surveyed several hundred practitioners via the Web to explore the varieties of skills, experiences, and viewpoints in the emerging data science community.

We used dimensionality reduction techniques to divide potential data scientists into five categories based on their self-ranked skill sets (Statistics, Math/Operations Research, Business, Programming, and Machine Learning/Big Data) and four categories based on their self-identification (Data Researchers, Data Businesspeople, Data Engineers, and Data Creatives). Examining the respondents by these categories provided additional insights into the types of professional activities, educational backgrounds, and even the scale of data used by different types of data scientists.

In this report, we combine our results with insights and data from others to provide a better understanding of the diversity of practitioners, and to argue for the value of clearer communication around roles, teams, and careers.

Table of Contents

  Analyzing the Analyzers
  1. Introduction
  2. Case Studies in Miscommunication
    1. Rock Stars and Gods
    2. Apples and Oranges
  3. A Survey of, and About, Professionals
    1. Clustering Data Scientists
      1. Self-Identification
      2. Skills
      3. Combining Skills and Self-ID
    2. The Variety of Data Scientists
      1. Data Businesspeople
      2. Data Creatives
      3. Data Developer
      4. Data Researchers
    3. Big Data
    4. Related Surveys
  4. T-Shaped Data Scientists
    1. Evidence for T-Shaped Data Scientists
  5. Data Scientists and Organizations
    1. Where Data People Come From: Science vs. Tools Education
    2. From Theory to Practice: Internships and Mentoring
    3. Teams and Org Charts
    4. Career Paths
  6. Final Thoughts
  A. Survey Details
    1. Design and Invitation
    2. Skills List
    3. Non-negative Matrix Factorization
    4. Acknowledgements
  About the Authors
  Colophon
  Copyright