Strata Conference Santa Clara 2014: Complete Video Compilation

Video Description

Gain a clear perspective on the future of big data—and all the analytics, architectures, techniques, tools, and technologies you need to use data successfully. With this complete video compilation, you’ll get a front-row seat to the keynotes, workshops, and sessions at O’Reilly’s Strata Conference Santa Clara 2014. You can download these videos or stream them through our HD player.

Table of Contents

  1. Tutorials
    1. Introduction to Machine Learning with IPython and scikit-learn - Olivier Grisel - Part 1 00:47:17
    2. Introduction to Machine Learning with IPython and scikit-learn - Olivier Grisel - Part 2 00:42:36
    3. Introduction to Machine Learning with IPython and scikit-learn - Olivier Grisel - Part 3 00:49:02
    4. Introduction to Machine Learning with IPython and scikit-learn - Olivier Grisel - Part 4 00:35:03
    5. IPython In Depth - Brian Granger and Fernando Pérez - Part 1 01:03:59
    6. IPython In Depth - Brian Granger and Fernando Pérez - Part 2 00:50:08
    7. IPython In Depth - Brian Granger and Fernando Pérez - Part 3 00:47:57
    8. Building a Data Platform - John Akred, Richard Williamson, and Stephen O'Sullivan - Part 1 00:43:02
    9. Building a Data Platform - John Akred, Richard Williamson, and Stephen O'Sullivan - Part 2 00:46:05
    10. Building a Data Platform - John Akred, Richard Williamson, and Stephen O'Sullivan - Part 3 00:53:31
    11. Building a Data Platform - John Akred, Richard Williamson, and Stephen O'Sullivan - Part 4 00:33:13
    12. Design Thinking for Dummies (Data Scientists) - Michael Stringer, Dean Malmgren, and Laurie Skelly - Part 1 00:21:53
    13. Design Thinking for Dummies (Data Scientists) - Michael Stringer, Dean Malmgren, and Laurie Skelly - Part 2 00:21:54
    14. Design Thinking for Dummies (Data Scientists) - Michael Stringer, Dean Malmgren, and Laurie Skelly - Part 3 00:39:54
    15. Dissecting Data Science Algorithms using Spreadsheets - John Foreman - Part 1 00:47:06
    16. Dissecting Data Science Algorithms using Spreadsheets - John Foreman - Part 2 00:46:06
    17. Dissecting Data Science Algorithms using Spreadsheets - John Foreman - Part 3 00:44:08
    18. Dissecting Data Science Algorithms using Spreadsheets - John Foreman - Part 4 00:29:29
    19. Introduction to Hadoop 2.0 - Rich Raposa - Part 1 00:52:03
    20. Introduction to Hadoop 2.0 - Rich Raposa - Part 2 00:33:39
    21. Introduction to Hadoop 2.0 - Rich Raposa - Part 3 00:57:53
    22. Introduction to Hadoop 2.0 - Rich Raposa - Part 4 00:40:29
    23. Large-scale Machine Learning Cookbook using GraphLab - Carlos Guestrin - Part 1 00:37:57
    24. Large-scale Machine Learning Cookbook using GraphLab - Carlos Guestrin - Part 2 00:40:39
    25. Large-scale Machine Learning Cookbook using GraphLab - Carlos Guestrin - Part 3 00:48:23
    26. Large-scale Machine Learning Cookbook using GraphLab - Carlos Guestrin - Part 4 00:35:47
    27. From Scattered to Scatterplots: An Introduction to d3.js - Scott Murray - Part 1 00:33:13
    28. From Scattered to Scatterplots: An Introduction to d3.js - Scott Murray - Part 2 00:47:45
    29. From Scattered to Scatterplots: An Introduction to d3.js - Scott Murray - Part 3 00:42:18
    30. From Scattered to Scatterplots: An Introduction to d3.js - Scott Murray - Part 4 00:43:17
    31. Effective Data Science With Scalding - Vitaly Gordon - Part 1 00:43:03
    32. Effective Data Science With Scalding - Vitaly Gordon - Part 2 00:48:15
    33. Big Data Workflows on Mesos Clusters - Florian Leibert, Paco Nathan, and Benjamin Hindman - Part 1 00:45:34
    34. Big Data Workflows on Mesos Clusters - Florian Leibert, Paco Nathan, and Benjamin Hindman - Part 2 00:29:23
    35. Big Data Workflows on Mesos Clusters - Florian Leibert, Paco Nathan, and Benjamin Hindman - Part 3 00:35:19
    36. Big Data Workflows on Mesos Clusters - Florian Leibert, Paco Nathan, and Benjamin Hindman - Part 4 00:37:06
    37. Adviser: Learning How to get A Second Opinion on Your Analysis when it's Important to get it Right - Leland Wilkinson - Part 1 00:45:31
    38. Adviser: Learning How to get A Second Opinion on Your Analysis when it's Important to get it Right - Leland Wilkinson - Part 2 00:39:39
    39. Adviser: Learning How to get A Second Opinion on Your Analysis when it's Important to get it Right - Leland Wilkinson - Part 3 01:00:01
    40. Building Real-Time Apps with Apache HBase - Ronan Stokes - Part 1 00:28:29
    41. Building Real-Time Apps with Apache HBase - Ronan Stokes - Part 2 00:33:34
    42. Building Real-Time Apps with Apache HBase - Ronan Stokes - Part 3 00:45:00
    43. Building Real-Time Apps with Apache HBase - Ronan Stokes - Part 4 00:43:31
    44. Data Transformation: Skills of the Agile Data Wrangler - Joe Hellerstein and Jeffrey Heer - Part 1 00:44:58
    45. Data Transformation: Skills of the Agile Data Wrangler - Joe Hellerstein and Jeffrey Heer - Part 2 00:49:36
  2. Hardcore Data Science
    1. Hardcore Data Science Opening Remarks - Ben Lorica 00:02:16
    2. Extreme Machine Learning - Alexander Gray 00:44:58
    3. What the #@)*$ is Big Data? A Holistic View of Data and Algorithms - Alice Zheng 00:42:34
    4. Overcoming the Barriers to Production-Ready Machine-Learning Workflows - Henrik Brink and Joshua Bloom 00:25:23
    5. Anomaly Detection - Ted Dunning 00:31:19
    6. Neural Networks for Machine Perception - Ilya Sutskever 00:29:58
    7. The Predictive Business - Kira Radinsky 00:37:39
    8. Can We Make Big Data Management Easier? - Magda Balazinska 00:41:28
    9. Design Challenges for Real Predictive Platforms - Max Gasner 00:31:19
    10. Machine Learning Gremlins - Ben Hamner 00:30:59
    11. Algebra for Scalable Analytics - Oscar Boykin 00:32:22
  3. Data-Driven Business Day
    1. Introduction to Data Driven Business Day - Alistair Croll 00:07:32
    2. Those Numbers Won't Measure Themselves - Farrah Bostic 00:20:51
    3. Social Data Intelligence: Integrating Social and Enterprise Data for Competitive Advantage - Susan Etlinger 00:18:23
    4. Open Data: It's Not Just for Governments - Jen van der Meer 00:19:51
    5. The Insight Economy - Krista Schnell 00:19:29
    6. 9 Levers for Converting Big Data and Analytics into Results - Christy Maver 00:11:33
    7. Deploying a Data Sciences Team -- The Promise and the Pitfalls - Diane Chang 00:16:21
    8. Sensing Best Practices - Ben Waber 00:22:28
    9. Leveraging Value from Open Data Through Collaboration - Peter Pirnejad 00:17:45
    10. Becoming a Learning Organization: From Data Teams to Corporate Influence - Pamela Peele 00:15:25
    11. Making Big Data Small - Baron Schwartz 00:19:29
    12. Big Data Meets Big Infrastructure: Going Underground in One Major European City - Narendra Mulani 00:11:13
    13. The Era of Data-Powered Government - Beth Blauer 00:19:19
    14. TripIt Uses Data to Organize Itineraries, No Matter Where You Book - Edith Harbaugh 00:11:58
  4. Keynotes
    1. Crossing the Chasm: What's New, What's Not - Geoffrey Moore 00:13:34
    2. Evolution from Apache Hadoop to the Enterprise Data Hub - Amr Awadallah 00:05:35
    3. Collecting Massive Data via Crowdsourcing - John Schitka 00:05:13
    4. Empowering Personalized Learning with Big Data - Ramona Pierson 00:09:41
    5. Hadoop in 5 Minutes or Less - John Schroeder 00:05:19
    6. People are Data Too - Farrah Bostic 00:05:56
    7. Bringing Big Data to One Billion People - Quentin Clark 00:10:01
    8. Small Data in Sports: Little Differences that Mean Big Outcomes - David Epstein 00:09:18
    9. The Art of Good Practice - Rodney Mullen 00:09:40
    10. Big Data Moonshots and Ground Control - Joe Hellerstein and Tutti Taygerly 00:10:41
    11. Data Science and Smart Systems: Creating the Digital Brain - Kaushik Das 00:10:56
    12. How Companies are Using Spark, and Where the Edge in Big Data Will Be - Matei Zaharia 00:11:21
    13. In-Hadoop Analytics: Bringing analytics to big data - Anjul Bhambhri 00:06:58
    14. Record Linkage and Other Statistical Models for Quantifying Conflict Casualties in Syria - Megan Price 00:10:20
    15. Ben Fry Keynote 00:09:59
    16. Survivorship Bias and the Psychology of Luck - David McRaney 00:18:54
  5. Sessions
    1. Apache Hadoop and the Emergence of the Enterprise Data Hub - Eli Collins 00:39:22
    2. Information Visualization for Large-Scale Data Workflows - Michael Conover 00:36:03
    3. Adaptive Adversaries: Building Systems to Fight Fraud and Cyber Intruders - Ari Gesher 00:42:23
    4. Fighting Global Cybercrime and BotNets using Big Data - Bryan Hurd and Herain Oberoi 00:38:08
    5. Navigating the Big Data Vendor Landscape - Edd Dumbill 00:43:43
    6. Best Practices for Hadoop In Production - Panel Discussion Facilitated by Forrester Analyst - Mike Gualtieri 00:38:19
    7. Thorn in the Side of Big Data: Too Few Artists - Chris Re 00:39:48
    8. 10,000: The Most Dangerous Number in Sports - David Epstein 00:39:28
    9. You're Halfway There: Moving from Insight to Action - Bob Filbin 00:40:18
    10. Building the Next Generation Data Architecture with Hadoop, Data Warehouse & Data Discovery Platform - Bill Franks 00:36:17
    11. Minority Report Meets Big Data: Touch and Interactive Big Data is Here - Justin Langseth and Eva Andreasson 00:41:00
    12. Machine Learning for Social Change - Fernand Pajot 00:30:20
    13. Harness Data in Real-Time with Infinite Storage - Yuvaraj Athur Raghuvir 00:38:02
    14. You Don't Need to Boil the Big Data Ocean with Hadoop - Ben Werther and Sanjay Mathur 00:38:52
    15. Predictive Modeling in the Cloud with Scikit-learn and IPython - Olivier Grisel 00:37:47
    16. Mining Student Notes in Real Time to Provide Study Guides - Perry Samson 00:52:58
    17. Thinking with Data - Max Shron 00:35:40
    18. Building a Data-centered Data Center for Agile Development - Justin Makeig 00:43:30
    19. Evolving Data Governance for the Big Data Enterprise - Scott Lee and Rachel Haines 00:41:11
    20. Making Big Data Cost Effective in a Bare Metal Cloud - Harold Hannon 00:41:29
    21. How Evernote Does Conversion Using Hadoop Analytics - Damon Cool 00:30:40
    22. Crowdsourcing at Locu: How I Learned to Stop Worrying and Love the Crowd - Adam Marcus 00:24:19
    23. Building a Lightweight Discovery Interface for Chinese Patents - Eric Pugh 00:40:12
    24. Superconductor: Scaling Charts with Design and GPUs - Leo Meyerovich 00:22:52
    25. Break Down Data Silos with Apache Accumulo - Adam Fuchs 00:21:06
    26. Organizing Big Data with the Crowd - Lukas Biewald 00:14:20
    27. Scalable PostgreSQL as your data platform - Ben Redman 00:33:11
    28. Unlocking the Secrets of Gertrude Stein - Ian Timourian 00:41:38
    29. A Different Look at Data and Security - Learning to Live with Fear - Pablos Holman 00:42:09
    30. Stand Back, I'm Going To Try Science! - Rachel Poulsen and John Akred 00:20:20
    31. Collaborative Advanced Analytics For Big Data - Bruno Aziza 00:39:40
    32. Network Science Made Simple: SNA for Pie Chart Makers - Marc Smith 00:16:21
    33. How Twitter Monitors Millions of Time-series - Yann Ramin 00:34:50
    34. Harvard's Clean Energy Project: Big Data Maps To Renewable Energy - Kai Trepte 00:36:38
    35. Working With Time Series Data Using Apache Cassandra - Patrick McFadin 00:15:47
    36. Friending Graph Analytics: Large-Scale Graph Processing Made Easy - Ted Willke 00:21:35
    37. Transforming Search Engine Marketing at Ask.com - Mohit Sati 00:41:15
    38. Music Videos and Gastronomification for Big Data Analysis - Brian Abelson and Thomas Levine 00:37:59
    39. Soylent Mean: Data Science is Made of People - Cameran Hetrick and Kimberly Stedman 00:36:24
    40. Big Data: Beyond Bare-Metal? - Mike Wendt 00:32:09
    41. Secrets of Apache Hive Queries and UDFs - Shrikanth Shankar 00:42:14
    42. Twitter and HP HAVEn: The Big Data Big Picture - Sanjay Goil 00:39:33
    43. Data Science: How to Build and Deploy a Team of Data Scientists - Diane Chang, Steven Hillion, Nick Kolegraff, and Matthew Gee 00:39:05
    44. The Netflix Data Platform - A Recipe for High Business Impact - Kurt Brown 00:42:40
    45. Bedtime Stories: Learning from Sleep Data - Monica Rogati 00:37:58
    46. Tracking a Soccer Game with Big Data - Srinath Perera 00:36:47
    47. Data Transformation: A User-Centric Approach to Accessing and Analyzing Big Data - Joe Hellerstein 00:38:50
    48. Apache Hadoop 2.0: Migration from 1.0 to 2.0 - Vinod Kumar Vavilapalli 00:53:05
    49. Getting a Handle on Hadoop and its Potential to Catalyze a New Information Architecture Model - Milan Vaclavik 00:42:05
    50. The Sidekick Pattern: Using Small Data to Increase the Value of Big Data - Abe Gong 00:30:51
    51. Exascale Data Analytics @ Facebook - Sambavi Muthukrishnan 00:44:54
    52. Sending Millions of Surveys Around the World on Mobile Phones - Max Richman 00:40:18
    53. Business Data Lake: An Evolution in Data Infrastructure - Jeffrey Kelly, Steven Hirsch, Steve Jones, and Sabrina Dahlgren 00:42:01
    54. Expressing Yourself in R - Hadley Wickham 00:34:59
    55. Data Journalism - Organized Crime and Corruption Reporting - Drew Sullivan 00:38:49
    56. The Inflection Point - Hadoop and Big Data Analytics - Anjul Bhambhri 00:44:00
    57. Spreadsheets: The Dark Matter of Big Data - Felienne Hermans 00:44:19
    58. Scale-Invariant Intelligence - Vin Sharma 00:39:19
    59. Probabilistic Programming: What, Why, How, and When - Beau Cronin 00:38:55
    60. Beyond Hadoop MapReduce: Interactive Advertising Insights with Shark @ Yahoo! - Nandu Jayakumar and Tim Tully 00:41:03
    61. Machine Learning for Machine Data - David Andrzejewski - Part 1 00:44:50
    62. Machine Learning for Machine Data - David Andrzejewski - Part 2 00:44:46
    63. Lessons from the Trenches: edo Interactive Leverages Hadoop to Build Customer Loyalty - Rob Rosen and Tim Garnto 00:36:15
    64. The IPython Notebook: Get Close to Your Data with Python and JavaScript - Brian Granger 00:45:34
    65. Government Data on Both Sides of the Bridge - Moderated by: Jesse Robbins - Panelists: Shannon Spanhake and Eddie Tejeda 00:42:01
    66. Enabling Business Transformation with Analytics over Real-time Streaming Data - Anand Venugopal and Pranay Tonpay 00:35:52
    67. The Next Wave of SQL-on-Hadoop: Building a Virtual EDW on Native Hadoop Data - Marcel Kornacker 00:47:05
    68. How Comcast Turns Big Data into Real-Time Operational Insights - Patrick Shumate 00:42:06
    69. Chicago Bars, Prisoner's Dilemma, and Practical Models in Search - Chris Harland 00:38:01
    70. Big Industrial Internet Data: Connecting and Optimizing at New Scales - Steven Gustafson and Parag Goradia - Part 1 00:34:24
    71. Big Industrial Internet Data: Connecting and Optimizing at New Scales - Steven Gustafson and Parag Goradia - Part 2 00:34:18
    72. FAST and FURIOUS Big Data Analytics Meets Hadoop - Wayne Thompson and Paul Kent 00:41:34
    73. The Urgent Need to Appify Big Data - Ryan Cunningham 00:30:41
    74. Unboxing Data Startups - Michael Abbott 00:38:50
    75. Apache Hive & Stinger: Petabyte Scale SQL, IN Hadoop - Owen O'Malley and Alan Gates 00:41:01
    76. Querying Petabytes of Data in Seconds - Reynold Xin and Sameer Agarwal 00:37:19
    77. The Need for Speed & Scale: A Database for Real-Time Analytics - Eric Frenkiel 00:37:05
    78. Graph All The Things! 11: Graph Database Use Cases That Aren't Social - Emil Eifrem 00:20:25
    79. Graph Analysis with One Trillion Edges on Apache Giraph - Avery Ching 00:34:09
    80. Big Data for Big Power: Smart Meters does not mean Smart Grids - Brett Sargent 00:36:03
    81. The Last Mile: Challenges and Opportunities in Data Tools - Wes McKinney 00:18:31
    82. Are We Data Scientists or Data Janitors? - Nenshad Bardoliwalla 00:39:13
    83. Session with Ben Fry 00:36:02
    84. Data for Good - Moderated by: Jake Porway - Panelists: Drew Conway, Rayid Ghani, and Elena Eneva 00:46:25
    85. NonStop HBase - Making HBase Continuously Available for Enterprise Deployment - Jagane Sundar 00:35:22
    86. Apache Mesos as an SDK for Building Distributed Frameworks - Paco Nathan 00:20:40
    87. Agile Analytics - Neal Ford 00:19:26
    88. Socializing Search. Professionally. - Sriram Sankar and Daniel Tunkelang 00:39:42
    89. Big Data for Better Data Centers - Krishna Raj Raja and Balaji Parimi 00:40:28
    90. One Size Does Not Fit All: Analyzing Data at Scale with AWS - Rahul Pathak 00:19:17
    91. Making Choices: What Kind of Relationship are You Seeking with Your Database? - J.R. Arredondo 00:35:12
    92. StatusWolf: Creating Dashboards That Don't Suck Using Art and Engineering - Mark Troyer 00:32:52
    93. Real-Time Analytics with NewSQL: Why Hadoop is not enough - Raj Bains 00:30:24
    94. MLbase: Distributed Machine Learning Made Easy - Ameet Talwalkar and Evan Sparks 00:39:48
    95. Real-time Analytics with Open Source Technologies - Fangjin Yang and Gian Merlino 00:33:35