Preface

Summary

In this book, I address two core questions:

  • What does it mean for an organization to be data-driven?
  • How does an organization get there?

Many organizations think that simply because they generate a lot of reports or have many dashboards, they are data-driven. Although those activities are part of what an organization does, they are typically backward-looking. That is, they are often a declaration of past or present facts without a great deal of context, without causal explanation of why something has or has not happened, and without recommendations of what to do next. In short, they state what happened but they are not prescriptive. As such, they have limited upside.

In contrast, consider more forward-looking analyses, such as predictive models that optimize ad spend or supply chain replenishment, or that minimize customer churn. They involve answering the “why” questions (or more generally, “w-questions”: who, what, when, why, and where), making recommendations and predictions, and telling a story around the findings. They are frequently a key driver in a data-driven organization. Those insights and recommendations, if acted upon, have a huge potential impact upon the organization.

However, such insights require that the right data is collected, that the data is trustworthy, that the analysis is good, that the insights are considered in the decision, and that they drive concrete actions so the potential can be realized. Phew! I call this sequence, the flow from collection to final impact, the analytics value chain.

This last step in the chain is critical. Analytics is not data-driven if its findings are never seriously considered or acted upon. If they go unread or are ignored, and the boss is going to do whatever he or she wants regardless of what the data says, then they are ineffectual. To be data-driven, an organization must have the right processes and the right culture in place to augment or drive critical business decisions with these analyses and therefore have a direct impact on the business.

Culture, then, is the key. This is a multifaceted problem that involves data quality and sharing, analyst hiring and training, communication, analytical organizational structure, metric design, A/B testing, decision-making processes, and more. This book will elucidate these ideas by providing insights and illustrative examples from a variety of industries. I also bring in the voices of experience through interviews with a variety of data science and analytics leaders, who offer advice and insights into what worked and what didn’t. I hope to inspire all our readers to become more data-driven.

Moreover, throughout the book I emphasize the role that data engineers, analysts, and managers of analysts can play. I suggest that a data-driven organization and requisite culture can and should be built not only from top-down leadership but also from the bottom up. As Todd Holloway, head of data science at Trulia, remarked at the 2014 Chief Data Officer Executive Forum, “The best ideas come from the guys closest to the data.” Not only are they the ones who work directly with the data sources and who recognize and can remedy the data-quality issues and understand how best to augment the data, but “they often come up with the good product ideas.” In addition, they can help educate the rest of the organization to be more data literate. Part of that comes from developing their skill set and using it to do good work. Another part, however, comes from being more business savvy—learning the right questions to ask and business problems to tackle—and then “selling” their insights and recommendations to the decision-makers and making a compelling case of what the finding or recommendation means to business and why it makes an impact.

And there are great impacts and gains to be had. One report,1 controlling for other factors, found that data-driven organizations have 5%–6% greater output and productivity than their less data-driven counterparts. They also had higher asset utilization, return on equity, and market value. Another report2 claims that analytics pays back $13.01 for every dollar spent. Being data-driven pays!

Data-drivenness is not a binary but rather a continuum: you can always be more data-driven, collect more high-quality relevant data, have a more skilled analytics organization, and do more testing. Moreover, you can always have a better decision-making process. In this book, I’ll discuss the hallmarks of great data-driven organizations. I’ll cover the infrastructure, skills, and culture needed to create organizations that take data, treat it as a core asset, and use it to drive and inform critical business decisions and ultimately make an impact. I will also cover some common anti-patterns: behaviors that inhibit a business from making the most of its data.

The goals of the book, then, are to inspire the analyst organization to play its part, to provide pause for thought—to ask “are we making the most of our data?” and “can we be more data-driven?”—and to stimulate discussion about what more can be done to make use of this key resource. It is never too early to be thinking about this. Senior management and founders should be working to bake this into the very fabric of their company at an early stage. So, let’s find out more about what is entailed.

Who Should Read This Book?

The information here will help you build and run an internal analytics program, deciding what data to gather and store, how to access it and make sense of it, and, most crucially, how to act on it.

Whether you’re the only data scientist at a startup (and wearing a half-dozen other hats, to boot!), or the manager at an established organization with a room—or a department—full of people reporting to you, if you have data and the desire to act more quickly, efficiently, and wisely, this book will help you develop not just a data program but a data-driven culture.

Chapter Organization

Roughly speaking, this book is organized around the flow along the analytics value chain. The first chapters cover data itself, in particular choosing the right data sources and ensuring that they are high quality and trustworthy. The next step in that chain is analysis. You need the right people with the right skills and tools to do good work and to generate impactful insights. I call this group the “analysts,” deliberately using the term in its broadest sense to cover data analysts, data scientists, and other members of the analytics organization. I do that to be inclusive because I believe that everyone, from a junior data analyst fresh out of school to a rockstar data scientist, has a role to play. I cover what makes a good analyst and how they can sharpen their skills, and I also cover organizational aspects: how those analysts should be formed into teams and business units. The next few chapters cover the actual analytical work itself, such as performing analyses, designing metrics, A/B testing, and storytelling. I then proceed to the next step in the chain: making decisions with those analyses and insights. Here I address what makes decision-making hard and how it can be improved.

Throughout all these chapters, there is a very strong message and theme: being data-driven is not just about data or the latest big data toolset; it is about culture. Culture is the dominant aspect that sets expectations of how far data is democratized, how it is used and viewed across the organization, and the resources and training invested in using data as a strategic asset. Thus, I draw all the lessons from the various steps in the value chain into a single culture chapter. One of the later chapters then discusses top-down data leadership and, in particular, the roles of two relatively new C-suite positions: the chief data officer and the chief analytics officer. However, culture can be shaped and influenced from the bottom up, too. Thus, throughout the book I directly address analysts and managers of analysts, highlighting what they can do to influence that culture and maximize their impact upon the organization. A true data-driven organization is a data democracy and has a large number of stakeholders who are vested in data, data quality, and the best use of data to make fact-based decisions and to leverage data for competitive advantage.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic

Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width

Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Constant width bold

Shows commands or other text that should be typed literally by the user.

Constant width italic

Shows text that should be replaced with user-supplied values or by values determined by context.

Tip

This element signifies a tip or suggestion.

Note

This element signifies a general note.

Safari® Books Online

Safari Books Online is an on-demand digital library that delivers expert content in both book and video form from the world’s leading authors in technology and business.

Technology professionals, software developers, web designers, and business and creative professionals use Safari Books Online as their primary resource for research, problem solving, learning, and certification training.

Safari Books Online offers a range of plans and pricing for enterprise, government, education, and individuals.

Members have access to thousands of books, training videos, and prepublication manuscripts in one fully searchable database from publishers like O’Reilly Media, Prentice Hall Professional, Addison-Wesley Professional, Microsoft Press, Sams, Que, Peachpit Press, Focal Press, Cisco Press, John Wiley & Sons, Syngress, Morgan Kaufmann, IBM Redbooks, Packt, Adobe Press, FT Press, Apress, Manning, New Riders, McGraw-Hill, Jones & Bartlett, Course Technology, and hundreds more. For more information about Safari Books Online, please visit us online.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

  • O’Reilly Media, Inc.
  • 1005 Gravenstein Highway North
  • Sebastopol, CA 95472
  • 800-998-9938 (in the United States or Canada)
  • 707-829-0515 (international or local)
  • 707-829-0104 (fax)

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at http://bit.ly/data-driven-org.

To comment or ask technical questions about this book, send email to .

For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.

Find us on Facebook: http://facebook.com/oreilly

Follow us on Twitter: http://twitter.com/oreillymedia

Watch us on YouTube: http://www.youtube.com/oreillymedia

Acknowledgments

Despite having a single author, this book is really the sum of contributions, ideas, and help from a lot of great experts and colleagues in the field. I would like to thank the following people for their extremely helpful advice, suggestions, insights, and support: Andrew Abela, Peter Aiken, Tracy Allison Altman, Samarth Baskar, Lon Binder, Neil Blumenthal, Yosef Borenstein, Lewis Broome, Trey Causey, Brian Dalessandro, Greg Elin, Samantha Everitt, Mario Faria, Stephen Few, Tom Fishburne, Andrew Francis Freeman, Dave Gilboa, Christina Kim, Nick Kim, Anjali Kumar, Greg Linden, Jason Gowans, Sebastian Gutierrez, Doug Laney, Shaun Lysen, Doug Mack, Patrick Mahoney, Chris Maliwat, Mikayla Markrich, Lynn Massimo, Sanjay Mathur, Miriah Meyer, Julie-Jennifer Nguyen, Scott Pauly, Jeff Potter, Matt Rizzo, Max Schron, Anna Smith, Nellwyn Thomas, Daniel Tunkelang, James Vallandingham, Satish Vedantam, Daniel White, and Dan Woods. Thanks too, in general, to my colleagues at Warby Parker, who were all very supportive. Sincerest apologies to anyone whom I may have inadvertently missed from this list.

I would especially like to thank Daniel Mintz, Julie Steele, Dan Woods, Lon Binder, and June Andrews, who acted as technical reviewers and who all provided very thorough, sage, and concrete suggestions that greatly helped improve the book.

Thanks also to the organizers from Data Driven Business, especially Antanina Kapchonava, and the participants of the Chief Data Officer Executive Forum held in New York City on November 12, 2014.

James Vallandingham generously re-created and modified Figure 4-1 especially for this book. Thanks, Jim!

I would like to thank Sebastian Gutierrez for some interesting conversation and for letting me steal some examples from his excellent data-visualization course.

I would also like to recognize the support of my friends and family, especially my long-suffering wife, Alexia, who described herself as a “book widow” at month two; and my mother, who has been incredibly supportive throughout my life.

Finally, I would like to extend my gratitude to all the great staff of O’Reilly, especially Tim McGovern, who edited the work and who pushed and shaped it in all the right places. Thanks, too, to Mike Loukides, Ben Lorica, Marie Beaugureau, and especially the production team: Colleen Lobner, Lucie Haskins, David Futato, Kim Cofer, Ellie Volckhausen, Amanda Kersey, and Rebecca Demarest.

1 Brynjolfsson, E., L. M. Hitt, and H. H. Kim. “Strength in Numbers: How Does Data-Driven Decisionmaking Affect Firm Performance?” Social Science Research Network (2011).

2 Nucleus Research, “Analytics pays back $13.01 for every dollar spent,” O204 (Boston, MA: Nucleus Research, 2014), 5.
