Strata Conference Santa Clara 2014: Complete Video Compilation

Video Description

Gain a clear perspective on the future of big data—and all the analytics, architectures, techniques, tools, and technologies you need to use data successfully. With this complete video compilation, you’ll get a front-row seat to the keynotes, workshops, and sessions at O’Reilly’s Strata Conference Santa Clara 2014. You can download these videos or stream them through our HD player.

Table of Contents

  1. Tutorials
    1. Introduction to Machine Learning with IPython and scikit-learn - Olivier Grisel - Part 1
    2. Introduction to Machine Learning with IPython and scikit-learn - Olivier Grisel - Part 2
    3. Introduction to Machine Learning with IPython and scikit-learn - Olivier Grisel - Part 3
    4. Introduction to Machine Learning with IPython and scikit-learn - Olivier Grisel - Part 4
    5. IPython In Depth - Brian Granger and Fernando Pérez - Part 1
    6. IPython In Depth - Brian Granger and Fernando Pérez - Part 2
    7. IPython In Depth - Brian Granger and Fernando Pérez - Part 3
    8. Building a Data Platform - John Akred, Richard Williamson, and Stephen O'Sullivan - Part 1
    9. Building a Data Platform - John Akred, Richard Williamson, and Stephen O'Sullivan - Part 2
    10. Building a Data Platform - John Akred, Richard Williamson, and Stephen O'Sullivan - Part 3
    11. Building a Data Platform - John Akred, Richard Williamson, and Stephen O'Sullivan - Part 4
    12. Design Thinking for Dummies (Data Scientists) - Michael Stringer, Dean Malmgren, and Laurie Skelly - Part 1
    13. Design Thinking for Dummies (Data Scientists) - Michael Stringer, Dean Malmgren, and Laurie Skelly - Part 2
    14. Design Thinking for Dummies (Data Scientists) - Michael Stringer, Dean Malmgren, and Laurie Skelly - Part 3
    15. Dissecting Data Science Algorithms using Spreadsheets - John Foreman - Part 1
    16. Dissecting Data Science Algorithms using Spreadsheets - John Foreman - Part 2
    17. Dissecting Data Science Algorithms using Spreadsheets - John Foreman - Part 3
    18. Dissecting Data Science Algorithms using Spreadsheets - John Foreman - Part 4
    19. Introduction to Hadoop 2.0 - Rich Raposa - Part 1
    20. Introduction to Hadoop 2.0 - Rich Raposa - Part 2
    21. Introduction to Hadoop 2.0 - Rich Raposa - Part 3
    22. Introduction to Hadoop 2.0 - Rich Raposa - Part 4
    23. Large-scale Machine Learning Cookbook using GraphLab - Carlos Guestrin - Part 1
    24. Large-scale Machine Learning Cookbook using GraphLab - Carlos Guestrin - Part 2
    25. Large-scale Machine Learning Cookbook using GraphLab - Carlos Guestrin - Part 3
    26. Large-scale Machine Learning Cookbook using GraphLab - Carlos Guestrin - Part 4
    27. From Scattered to Scatterplots: An Introduction to d3.js - Scott Murray - Part 1
    28. From Scattered to Scatterplots: An Introduction to d3.js - Scott Murray - Part 2
    29. From Scattered to Scatterplots: An Introduction to d3.js - Scott Murray - Part 3
    30. From Scattered to Scatterplots: An Introduction to d3.js - Scott Murray - Part 4
    31. Effective Data Science With Scalding - Vitaly Gordon - Part 1
    32. Effective Data Science With Scalding - Vitaly Gordon - Part 2
    33. Big Data Workflows on Mesos Clusters - Florian Leibert, Paco Nathan, and Benjamin Hindman - Part 1
    34. Big Data Workflows on Mesos Clusters - Florian Leibert, Paco Nathan, and Benjamin Hindman - Part 2
    35. Big Data Workflows on Mesos Clusters - Florian Leibert, Paco Nathan, and Benjamin Hindman - Part 3
    36. Big Data Workflows on Mesos Clusters - Florian Leibert, Paco Nathan, and Benjamin Hindman - Part 4
    37. Adviser: Learning How to get A Second Opinion on Your Analysis when it's Important to get it Right - Leland Wilkinson - Part 1
    38. Adviser: Learning How to get A Second Opinion on Your Analysis when it's Important to get it Right - Leland Wilkinson - Part 2
    39. Adviser: Learning How to get A Second Opinion on Your Analysis when it's Important to get it Right - Leland Wilkinson - Part 3
    40. Building Real-Time Apps with Apache HBase - Ronan Stokes - Part 1
    41. Building Real-Time Apps with Apache HBase - Ronan Stokes - Part 2
    42. Building Real-Time Apps with Apache HBase - Ronan Stokes - Part 3
    43. Building Real-Time Apps with Apache HBase - Ronan Stokes - Part 4
    44. Data Transformation: Skills of the Agile Data Wrangler - Joe Hellerstein and Jeffrey Heer - Part 1
    45. Data Transformation: Skills of the Agile Data Wrangler - Joe Hellerstein and Jeffrey Heer - Part 2
  2. Hardcore Data Science
    1. Hardcore Data Science Opening Remarks - Ben Lorica
    2. Extreme Machine Learning - Alexander Gray
    3. What the #@)*$ is Big Data? A Holistic View of Data and Algorithms - Alice Zheng
    4. Overcoming the Barriers to Production-Ready Machine-Learning Workflows - Henrik Brink and Joshua Bloom
    5. Anomaly Detection - Ted Dunning
    6. Neural Networks for Machine Perception - Ilya Sutskever
    7. The Predictive Business - Kira Radinsky
    8. Can We Make Big Data Management Easier? - Magda Balazinska
    9. Design Challenges for Real Predictive Platforms - Max Gasner
    10. Machine Learning Gremlins - Ben Hamner
    11. Algebra for Scalable Analytics - Oscar Boykin
  3. Data-Driven Business Day
    1. Introduction to Data Driven Business Day - Alistair Croll
    2. Those Numbers Won't Measure Themselves - Farrah Bostic
    3. Social Data Intelligence: Integrating Social and Enterprise Data for Competitive Advantage - Susan Etlinger
    4. Open Data: It's Not Just for Governments - Jen van der Meer
    5. The Insight Economy - Krista Schnell
    6. 9 Levers for Converting Big Data and Analytics into Results - Christy Maver
    7. Deploying a Data Sciences Team -- The Promise and the Pitfalls - Diane Chang
    8. Sensing Best Practices - Ben Waber
    9. Leveraging Value from Open Data Through Collaboration - Peter Pirnejad
    10. Becoming a Learning Organization: From Data Teams to Corporate Influence - Pamela Peele
    11. Making Big Data Small - Baron Schwartz
    12. Big Data Meets Big Infrastructure: Going Underground in One Major European City - Narendra Mulani
    13. The Era of Data-Powered Government - Beth Blauer
    14. TripIt Uses Data to Organize Itineraries, No Matter Where You Book - Edith Harbaugh
  4. Keynotes
    1. Crossing the Chasm: What's New, What's Not - Geoffrey Moore
    2. Evolution from Apache Hadoop to the Enterprise Data Hub - Amr Awadallah
    3. Collecting Massive Data via Crowdsourcing - John Schitka
    4. Empowering Personalized Learning with Big Data - Ramona Pierson
    5. Hadoop in 5 Minutes or Less - John Schroeder
    6. People are Data Too - Farrah Bostic
    7. Bringing Big Data to One Billion People - Quentin Clark
    8. Small Data in Sports: Little Differences that Mean Big Outcomes - David Epstein
    9. The Art of Good Practice - Rodney Mullen
    10. Big Data Moonshots and Ground Control - Joe Hellerstein and Tutti Taygerly
    11. Data Science and Smart Systems: Creating the Digital Brain - Kaushik Das
    12. How Companies are Using Spark, and Where the Edge in Big Data Will Be - Matei Zaharia
    13. In-Hadoop Analytics: Bringing analytics to big data - Anjul Bhambhri
    14. Record Linkage and Other Statistical Models for Quantifying Conflict Casualties in Syria - Megan Price
    15. Ben Fry Keynote
    16. Survivorship Bias and the Psychology of Luck - David McRaney
  5. Sessions
    1. Apache Hadoop and the Emergence of the Enterprise Data Hub - Eli Collins
    2. Information Visualization for Large-Scale Data Workflows - Michael Conover
    3. Adaptive Adversaries: Building Systems to Fight Fraud and Cyber Intruders - Ari Gesher
    4. Fighting Global Cybercrime and BotNets using Big Data - Bryan Hurd and Herain Oberoi
    5. Navigating the Big Data Vendor Landscape - Edd Dumbill
    6. Best Practices for Hadoop In Production - Panel Discussion Facilitated by Forrester Analyst - Mike Gualtieri
    7. Thorn in the Side of Big Data: Too Few Artists - Chris Ré
    8. 10,000: The Most Dangerous Number in Sports - David Epstein
    9. You're Halfway There: Moving from Insight to Action - Bob Filbin
    10. Building the Next Generation Data Architecture with Hadoop, Data Warehouse & Data Discovery Platform - Bill Franks
    11. Minority Report Meets Big Data: Touch and Interactive Big Data is Here - Justin Langseth and Eva Andreasson
    12. Machine Learning for Social Change - Fernand Pajot
    13. Harness Data in Real-Time with Infinite Storage - Yuvaraj Athur Raghuvir
    14. You Don't Need to Boil the Big Data Ocean with Hadoop - Ben Werther and Sanjay Mathur
    15. Predictive Modeling in the Cloud with Scikit-learn and IPython - Olivier Grisel
    16. Mining Student Notes in Real Time to Provide Study Guides - Perry Samson
    17. Thinking with Data - Max Shron
    18. Building a Data-centered Data Center for Agile Development - Justin Makeig
    19. Evolving Data Governance for the Big Data Enterprise - Scott Lee and Rachel Haines
    20. Making Big Data Cost Effective in a Bare Metal Cloud - Harold Hannon
    21. How Evernote Does Conversion Using Hadoop Analytics - Damon Cool
    22. Crowdsourcing at Locu: How I Learned to Stop Worrying and Love the Crowd - Adam Marcus
    23. Building a Lightweight Discovery Interface for Chinese Patents - Eric Pugh
    24. Superconductor: Scaling Charts with Design and GPUs - Leo Meyerovich
    25. Break Down Data Silos with Apache Accumulo - Adam Fuchs
    26. Organizing Big Data with the Crowd - Lukas Biewald
    27. Scalable PostgreSQL as your data platform - Ben Redman
    28. Unlocking the Secrets of Gertrude Stein - Ian Timourian
    29. A Different Look at Data and Security - Learning to Live with Fear - Pablos Holman
    30. Stand Back, I'm Going To Try Science! - Rachel Poulsen and John Akred
    31. Collaborative Advanced Analytics For Big Data - Bruno Aziza
    32. Network Science Made Simple: SNA for Pie Chart Makers - Marc Smith
    33. How Twitter Monitors Millions of Time-series - Yann Ramin
    34. Harvard's Clean Energy Project: Big Data Maps To Renewable Energy - Kai Trepte
    35. Working With Time Series Data Using Apache Cassandra - Patrick McFadin
    36. Friending Graph Analytics: Large-Scale Graph Processing Made Easy - Ted Willke
    37. Transforming Search Engine Marketing at Ask.com - Mohit Sati
    38. Music Videos and Gastronomification for Big Data Analysis - Brian Abelson and Thomas Levine
    39. Soylent Mean: Data Science is Made of People - Cameran Hetrick and Kimberly Stedman
    40. Big Data: Beyond Bare-Metal? - Mike Wendt
    41. Secrets of Apache Hive Queries and UDFs - Shrikanth Shankar
    42. Twitter and HP HAVEn: The Big Data Big Picture - Sanjay Goil
    43. Data Science How to Build and Deploy a Team of Data Scientists - Diane Chang, Steven Hillion, Nick Kolegraff, and Matthew Gee
    44. The Netflix Data Platform - A Recipe for High Business Impact - Kurt Brown
    45. Bedtime Stories: Learning from Sleep Data - Monica Rogati
    46. Tracking a Soccer Game with Big Data - Srinath Perera
    47. Data Transformation: A User-Centric Approach to Accessing and Analyzing Big Data - Joe Hellerstein
    48. Apache Hadoop 2.0: Migration from 1.0 to 2.0 - Vinod Kumar Vavilapalli
    49. Getting a Handle on Hadoop and its Potential to Catalyze a New Information Architecture Model - Milan Vaclavik
    50. The Sidekick Pattern: Using Small Data to Increase the Value of Big Data - Abe Gong
    51. Exascale Data Analytics @ Facebook - Sambavi Muthukrishnan
    52. Sending Millions of Surveys Around the World on Mobile Phones - Max Richman
    53. Business Data Lake: An Evolution in Data Infrastructure - Jeffrey Kelly, Steven Hirsch, Steve Jones, and Sabrina Dahlgren
    54. Expressing Yourself in R - Hadley Wickham
    55. Data Journalism - Organized Crime and Corruption Reporting - Drew Sullivan
    56. The Inflection Point - Hadoop and Big Data Analytics - Anjul Bhambhri
    57. Spreadsheets: The Dark Matter of Big Data - Felienne Hermans
    58. Scale-Invariant Intelligence - Vin Sharma
    59. Probabilistic Programming: What, Why, How, and When - Beau Cronin
    60. Beyond Hadoop MapReduce: Interactive Advertising Insights with Shark @ Yahoo! - Nandu Jayakumar and Tim Tully
    61. Machine Learning for Machine Data - David Andrzejewski - Part 1
    62. Machine Learning for Machine Data - David Andrzejewski - Part 2
    63. Lessons from the Trenches: edo Interactive Leverages Hadoop to Build Customer Loyalty - Rob Rosen and Tim Garnto
    64. The IPython Notebook: Get Close to Your Data with Python and JavaScript - Brian Granger
    65. Government Data on Both Sides of the Bridge - Moderated by: Jesse Robbins - Panelists: Shannon Spanhake and Eddie Tejeda
    66. Enabling Business Transformation with Analytics over Real-time Streaming Data - Anand Venugopal and Pranay Tonpay
    67. The Next Wave of SQL-on-Hadoop: Building a Virtual EDW on Native Hadoop Data - Marcel Kornacker
    68. How Comcast Turns Big Data into Real-Time Operational Insights - Patrick Shumate
    69. Chicago Bars, Prisoner's Dilemma, and Practical Models in Search - Chris Harland
    70. Big Industrial Internet Data: Connecting and Optimizing at New Scales - Steven Gustafson and Parag Goradia - Part 1
    71. Big Industrial Internet Data: Connecting and Optimizing at New Scales - Steven Gustafson and Parag Goradia - Part 2
    72. FAST and FURIOUS Big Data Analytics Meets Hadoop - Wayne Thompson and Paul Kent
    73. The Urgent Need to Appify Big Data - Ryan Cunningham
    74. Unboxing Data Startups - Michael Abbott
    75. Apache Hive & Stinger: Petabyte Scale SQL, IN Hadoop - Owen O'Malley and Alan Gates
    76. Querying Petabytes of Data in Seconds - Reynold Xin and Sameer Agarwal
    77. The Need for Speed & Scale: A Database for Real-Time Analytics - Eric Frenkiel
    78. Graph All The Things! 11: Graph Database Use Cases That Aren't Social - Emil Eifrem
    79. Graph Analysis with One Trillion Edges on Apache Giraph - Avery Ching
    80. Big Data for Big Power: Smart Meters does not mean Smart Grids - Brett Sargent
    81. The Last Mile: Challenges and Opportunities in Data Tools - Wes McKinney
    82. Are We Data Scientists or Data Janitors? - Nenshad Bardoliwalla
    83. Session with Ben Fry
    84. Data for Good - Moderated by: Jake Porway - Panelists: Drew Conway, Rayid Ghani, and Elena Eneva
    85. NonStop HBase - Making HBase Continuously Available for Enterprise Deployment - Jagane Sundar
    86. Apache Mesos as an SDK for Building Distributed Frameworks - Paco Nathan
    87. Agile Analytics - Neal Ford
    88. Socializing Search. Professionally. - Sriram Sankar and Daniel Tunkelang
    89. Big Data for Better Data Centers - Krishna Raj Raja and Balaji Parimi
    90. One Size Does Not Fit All: Analyzing Data at Scale with AWS - Rahul Pathak
    91. Making Choices: What Kind of Relationship are You Seeking with Your Database? - J.R. Arredondo
    92. StatusWolf: Creating Dashboards That Don't Suck Using Art and Engineering - Mark Troyer
    93. Real-Time Analytics with NewSQL: Why Hadoop is not enough - Raj Bains
    94. MLbase: Distributed Machine Learning Made Easy - Ameet Talwalkar and Evan Sparks
    95. Real-time Analytics with Open Source Technologies - Fangjin Yang and Gian Merlino