Doing Data Science by Cathy O'Neil, Rachel Schutt

Chapter 5. Logistic Regression

The contributor for this chapter is Brian Dalessandro. Brian is a VP of data science at Media6Degrees (also known as M6D or Media 6 Degrees), a New York City startup in the online advertising space, and he’s active in the research community. He has also served as cochair of the KDD competition. Figure 5-1 shows Brian’s data science profile; his y-axis is scaled from Clown to Rockstar.

Figure 5-1. Brian’s data science profile

Brian came to talk to the class about logistic regression and evaluation, but he started out with two thought experiments.

Thought Experiments

  1. How would data science differ if we had a “grand unified theory of everything”? Take this to mean a symbolic explanation of how the world works. This one question raises a bunch of other questions:

    • Would we even need data science if we had such a theory?
    • Is it even theoretically possible to have such a theory? Do such theories lie only in the realm of, say, physics, where we can anticipate the exact return of a comet we see once a century?
    • What’s the critical difference between physics and data science that makes such a theory implausible?
    • Is it just accuracy? Or, more generally, how much we imagine can be explained? Is it because we predict human behavior, which our predictions can themselves affect, creating a feedback loop?

      It might be useful to think of the sciences as a continuum, ...
