Strata Conference Santa Clara 2013: Complete Video Compilation

Video Description

Didn’t make it to Strata Santa Clara 2013? No problem. This complete video compilation puts you front and center at every keynote, session, and tutorial from the biggest Strata Conference to date. With more than 100 presentations from today’s leading big data practitioners, you’ll learn the latest approaches to data-driven business, data design, data science, new data from an increasingly connected world, and many other issues.

Table of Contents

  1. Tutorials
    1. Just the Basics: Core Data Science Skills with Kaggle's Top Competitors - William Cukierski and Ben Hamner - Part 1 00:37:28
    2. Just the Basics: Core Data Science Skills with Kaggle's Top Competitors - William Cukierski and Ben Hamner - Part 2 00:35:51
    3. Just the Basics: Core Data Science Skills with Kaggle's Top Competitors - William Cukierski and Ben Hamner - Part 3 00:54:43
    4. Using HBase effectively - What You Need to Know as an Application Developer - Jonathan Hsieh and Himanshu Vashishtha - Part 1 00:56:58
    5. Using HBase effectively - What You Need to Know as an Application Developer - Jonathan Hsieh and Himanshu Vashishtha - Part 2 00:43:59
    6. Using HBase effectively - What You Need to Know as an Application Developer - Jonathan Hsieh and Himanshu Vashishtha - Part 3 01:04:26
    7. Using HBase effectively - What You Need to Know as an Application Developer - Jonathan Hsieh and Himanshu Vashishtha - Part 4 00:44:29
    8. Hadoop Data Warehousing with Hive - Dean Wampler - Part 1 00:48:43
    9. Hadoop Data Warehousing with Hive - Dean Wampler - Part 2 00:36:42
    10. Hadoop Data Warehousing with Hive - Dean Wampler - Part 3 00:51:30
    11. Hadoop Data Warehousing with Hive - Dean Wampler - Part 4 00:35:40
    12. An Introduction to the Berkeley Data Analytics Stack (BDAS) Featuring Spark, Spark Streaming, and Shark - Ion Stoica - Part 1 00:31:49
    13. An Introduction to the Berkeley Data Analytics Stack (BDAS) Featuring Spark, Spark Streaming, and Shark - Matei Zaharia - Part 2 01:04:56
    14. An Introduction to the Berkeley Data Analytics Stack (BDAS) Featuring Spark, Spark Streaming, and Shark - Shivaram Venkataraman - Part 3 00:52:36
    15. An Introduction to the Berkeley Data Analytics Stack (BDAS) Featuring Spark, Spark Streaming, and Shark - Tathagata Das - Part 4 00:38:16
    16. D3.js tutorial - Scott Murray and Jerome Cukier - Part 1 00:41:03
    17. D3.js tutorial - Scott Murray and Jerome Cukier - Part 2 00:45:09
    18. D3.js tutorial - Scott Murray and Jerome Cukier - Part 3 01:03:19
    19. D3.js tutorial - Scott Murray and Jerome Cukier - Part 4 00:15:53
    20. Think Like a Data Journalist: How the Guardian Turns Data into Stories Every Day - Simon Rogers and Feilding Cage - Part 1 00:47:50
    21. Think Like a Data Journalist: How the Guardian Turns Data into Stories Every Day - Simon Rogers and Feilding Cage - Part 2 00:35:10
    22. Think Like a Data Journalist: How the Guardian Turns Data into Stories Every Day - Simon Rogers and Feilding Cage - Part 3 00:33:35
    23. Think Like a Data Journalist: How the Guardian Turns Data into Stories Every Day - Simon Rogers and Feilding Cage - Part 4 00:53:21
    24. Python for Data Analysis - Wes McKinney - Part 1 00:48:03
    25. Python for Data Analysis - Wes McKinney - Part 2 00:40:32
    26. Python for Data Analysis - Wes McKinney - Part 3 00:42:14
    27. Python for Data Analysis - Wes McKinney - Part 4 00:44:21
    28. Introduction to Apache Hadoop - Sarah Sproehnle - Part 1 00:43:38
    29. Introduction to Apache Hadoop - Sarah Sproehnle - Part 2 00:49:43
    30. Introduction to Apache Hadoop - Sarah Sproehnle - Part 3 00:41:57
    31. Introduction to Apache Hadoop - Sarah Sproehnle - Part 4 00:35:37
    32. Google Cloud for Data Crunchers - Ryan Boyd, Michael Manoochehri, Julia Ferraioli, and Takashi Matsuo - Part 1 00:21:12
    33. Google Cloud for Data Crunchers - Ryan Boyd, Michael Manoochehri, Julia Ferraioli, and Takashi Matsuo - Part 2 00:22:35
  2. Big Data for Enterprise IT Day
    1. Revolution or Evolution - Mark Madsen and Marc Demarest 00:54:52
    2. The Big Data Business Model Maturity Index - Bill Schmarzo 00:29:54
    3. Big Data is a Business Problem - Krish Krishnan 00:37:31
    4. A Fundamentally Different Approach to Hadoop Based Enterprise Analytics Architecture - Jeff Denworth 00:05:51
    5. Deep Learning - The Biggest Data Science Breakthrough of the Decade - Jeremy Howard 00:45:14
    6. Data Wrangling: Making People Productive with Data - Joe Hellerstein 00:39:17
    7. The Laws of Data Mining - Duncan Ross 00:46:03
    8. How to Interview a Data Scientist - Daniel Tunkelang 00:34:45
    9. The Science of Managing Data Scientists - Kate Matsudaira 00:28:25
    10. Keep Your Data Science Efforts from Derailing - Marck Vaisman and Sean Murphy 00:26:51
  3. Data Driven Business Day
    1. The Business Singularity: Why Software Means Cycle Time Trumps Scale - Alistair Croll 00:13:22
    2. Canary in the Coalmine: How Social Data Can Prepare Us for Big Data - Susan Etlinger 00:29:09
    3. Data is Not a Business Model: Moving Knowledge to Action - Jen van der Meer 00:22:58
    4. Getting Big Benefits from Big Data - Jeanne Harris 00:26:58
    5. Real-time Big Data Analytics: From Deployment to Production - David Smith 00:21:16
    6. Big Data Analytics Survey: How Enterprises are REALLY Using Big Data - Rebecca Shockley 00:16:12
    7. We Have Fancy Math, Now What? - Timothy Mohn 00:18:06
    8. Data and Campaigns: A Conversation with Obama For America Chief Scientist Rayid Ghani - Rayid Ghani 00:44:00
    9. Turning a Telco's Network Data into Insights for Retailers - Ruben Lara Hernandez and Michael Fishwick 00:27:02
    10. Implementing a Simpler, More Agile Big Data Architecture with Hadoop, NoSQL DB and Search for a New Breed of Real-time Applicati 00:19:13
    11. Putting Big Data to Work in Retail - Increasing Margins, Avoiding Food Waste - Jan Karstens and Johanna Fleckner 00:19:57
    12. Let the Data Decide: Predictive Analytics in Healthcare - Eugene Kolker 00:27:35
    13. Data and the Hurricane: Data-Driven Response to Sandy - Ari Gesher 00:20:16
    14. Secure Analytics on the Cloud - Khaled El Emam 00:18:27
    15. Data-Driven Disaster Relief - Molly Turner and Riley Newman 00:18:08
    16. Flying with Elephants: How CAASD Uses Hadoop to Mine Aviation Data - Marcio Silva 00:19:11
  4. Keynotes
    1. Video Games: The Biggest Big Data Challenge - Rajat Taneja 00:09:15
    2. Hadoop: The Foundation for Change - Scott Yara 00:14:10
    3. Committing to Recommendation Algorithms - Eric Colson 00:08:47
    4. Hadoop: Big Results - John Schroeder 00:05:32
    5. Big Data on Small Devices: Data Science goes Mobile - Yael Garten 00:11:04
    6. Using Data to Honor the Human Right to Education - Prasad Ram 00:06:01
    7. Moneyballing Government - Jennifer Pahlka 00:06:46
    8. Getting Big Benefits from Big Data (Keynote) - Jeanne Harris 00:14:02
    9. Xbox Data is XXL - Dave Campbell 00:11:33
    10. Broad Data: What Happens When the Web of Data Becomes Real? - James Hendler 00:08:49
    11. Delivering Intelligence Wherever Data Lives - Girish Juneja 00:06:14
    12. Grafting Hadoop and SAP HANA Together - Joydeep Das 00:07:42
    13. Human Fault-tolerance - Nathan Marz 00:07:57
    14. Algorithmic Illusions: Hidden Biases of Big Data - Kate Crawford 00:17:26
    15. Big Data vs The Beltway: The Regulatory Risks to Data-Driven Businesses - Kenneth Cukier 00:19:36
    16. The Victory Lab - Sasha Issenberg 00:19:51
  5. Sessions
    1. Dodging the Digital Creep Factor - Shelley Evenson 00:34:54
    2. Location Intelligence Targets Information for Development - Stewart Collis 00:44:30
    3. How to Transform your Business by Choosing the Right Big Data Stack - Billy Bosworth and Sean Knapp 00:40:43
    4. Expect More From Hadoop - Ted Dunning 00:42:21
    5. Beyond Hadoop MapReduce: Interactive Analytic Insights Using Spark - Sharmila Shahani-Mulligan, Matei Zaharia, and Stephanie McR 00:38:27
    6. Agile Data Wrangling and Web-based Visualizations - Chang She 00:41:02
    7. Introduction to Forecasting - Michael Bailey 00:40:54
    8. Funnel Analysis in Hadoop at Etsy - Matt Walker, Wil Stuckey, and Steve Mardenfeld 00:37:38
    9. Sci vs. Sci: Attack Vectors for Black-hat Data Scientists, and Possible Countermeasures - Joseph Turian 00:28:28
    10. Public Health Case Study: Tracking Zombies and Vampires using Social Media - John Feland 00:32:39
    11. Strengthening the Bond Between Hadoop and Your Analytic Database - Joydeep Das 00:45:51
    12. Revolutionizing Governance and eDiscovery with Entity Analytics - Tim Estes and Brandon Daniels 00:46:56
    13. An Introduction to Apache Drill - Tomer Shiran 00:34:33
    14. Facet: The Recursive Approach to Visualization - Vadim Ogievetsky 00:38:07
    15. Next-Gen Data Scientists - Rachel Schutt 00:37:21
    16. Building Scalable Big Data Infrastructure Using Open Source Software - Sam William 00:35:49
    17. Who is Fake? Discover Astroturfing or Attempts of Fake Influence! - Lutz Finger 00:51:14
    18. The Hadoop Data Reservoir - Requirements and Pitfalls - Peter Schlampp 00:40:16
    19. Demonstrating the High-Performance Future of Hadoop - Josh Klahr and Gavin Sherry 00:37:32
    20. Sketching Techniques for Real-time Big Data - Bahman Bahmani 00:37:17
    21. Designed for Insight: Principles of Big Data Design - Douglas van der Molen 00:33:36
    22. The IPython Notebook: a Comprehensive Tool for Data Science - Brian Granger 00:40:20
    23. Coordinating the Many Tools of Big Data in Hadoop - Alan Gates 00:39:24
    24. How to Crowdsource Large Scale Identity Theft and Fraud to Make Bucket Loads of Easy Money - Jo Prichard 00:40:09
    25. Real Time Network Analytics with Storm - Mauricio Vacas, Fausto Inestroza, and Sonali Parthasarathy 00:36:18
    26. Data Visualization Design Using Shneiderman's Mantra: Overview First, Zoom and Filter, Then Details-on-Demand - Eric Legrand and 00:37:34
    27. Pervasive Data Munging Gremlins - Bradley Voytek 00:37:48
    28. Building Tools for the Hadoop Developer - Matt Winkler 00:35:34
    29. Big Data is a Hotbed of Thoughtcrime, Part II: The Code - Jim Adler 00:36:22
    30. Zooniverse: Web-scale Citizen Science - Arfon Smith 00:43:15
    31. Implementing Big Data at the Speed of Business - Raanan Dagan and Rahul Deshmukh 00:40:55
    32. Learning's Clarion Call: Teaming to Improve US Education with Big Data Science - Marie Bienkowski, Jace Kohlmeier, Zachary Pardo 00:41:22
    33. The Future of Relational (or Why You Can't Escape SQL) - Tim O'Brien 00:44:03
    34. Using Web Standards to Create Interactive Data Visualizations - Nicolas Garcia Belmonte 00:33:14
    35. What To Do When Your Machine Learning Gets Attacked - Vishwanath Ramarao 00:39:47
    36. Tricks for Distributed System Debugging and Diagnosis - Philip Zeyliger 00:33:33
    37. Strategies for Avoiding Big Privacy 00:52:31
    38. Bigger Than Any One - Solving Large-Scale Data Problems with People and Machines - Tyler Bell 00:43:32
    39. SQL on Hadoop: Defining the New Generation of Analytic Databases - Carl Steinbach 00:44:20
    40. Leveraging Hadoop Data: High Performance Analytics On ParAccel - John Santaferraro and Walt Maguire 00:38:44
    41. Petascale Processing On An Open Source Budget: An Introduction to QFS - Jim Kelly 00:35:01
    42. Using Every Pixel to Visualize Big Data - Lynwood Bishop 00:44:47
    43. Real-time Stream Processing and Visualization Using Kafka, Storm, and d3.js - Justin Langseth and Byron Ellis 00:39:29
    44. How Hadoop in the Cloud Affects Developer-Friendly Decision Making - Philip Kromer 00:35:44
    45. Crowdfunded Open Doctor Data - Fred Trotter 00:36:25
    46. Medical Data: Going from Hospitals to Home - Carson Darling 00:38:42
    47. When Energy Met Intelligence: Utilities Using Hadoop for Analytics at Scale - Greg Khairallah and Bert Haskell 00:45:57
    48. Tools to Turn Emergencies into Knowledge: Turning 911 into 411 - Eron Kelly and Paul Henderson 00:37:03
    49. Maps Not Lists: Network Graphs for Data Exploration - Amy Heineike 00:38:41
    50. Feedback Control for Programmers and Other Strangers - Philipp Janert 00:33:37
    51. Big Data Tag-Team: Hadoop and the Data Warehouse - Shaun Connolly and Tasso Argyros 00:39:40
    52. Monkeys and Math: How MailChimp Catches Bad Guys - John Foreman 00:34:39
    53. Sociometric Badges: Using Wearable Sensors to Change Management - Ben Waber 00:45:30
    54. Data Set Management System for Hadoop - Michael Lang Sr. and Michael Lang Jr. 00:37:03
    55. Ready for Primetime? What Enterprise-ready Really Means - Charles Zedlewski 00:44:42
    56. High-Volume Data Collection and Real Time Analytics Using Redis - C. Aaron Cois and Tim Palko 00:34:30
    57. Four Pillars of Effective Visualizations - Noah Iliinsky 00:40:23
    58. Real-World Machine Learning on Big Data: Which Method(s) Should You Use? - Alexander Gray 00:43:17
    59. The Workflow Abstraction - Paco Nathan 00:41:18
    60. A Model Strategy for Data Journalism in a Country Without Open Data - Sandra Crucianelli and Angelica Peralta Ramos 00:41:15
    61. Deriving an Interest Graph for Social Data - Anna Smith 00:23:34
    62. Data Science vs. Analytics - Approaches to Problem Solving - Nick Kolegraff 00:29:07
    63. The Rise of the Scientific Databases - John A. De Goes 00:39:20
    64. Language Technologies for a Connected World: Processing and Visualizing Unstructured Text in 5000 Languages - Robert Munro 00:37:25
    65. Third Generation Tools for Realizing Machine Learning Algorithms - Dr. Vijay Srinivas Agneeswaran 00:35:54
    66. The BigData Top100 List - Milind Bhandarkar and Chaitan Baru 00:40:10
    67. The Web As The Greatest Dataset Of All Time - Lisa Green, Greg Lindahl, and Kevin Burton 00:38:18
    68. Dotting the I's with Hadoop on Eseries - David Henry and Benjamin Lloyd 00:45:04
    69. Using Hadoop to Expand Data Warehousing - Mike Peterson 00:41:17
    70. Impala: A Modern SQL Engine for Hadoop - Justin Erickson 00:39:57
    71. Introducing Julia - a New Open Source Mathematical Programming Language - Michael Bean 00:32:46
    72. Building Recommendation Platforms with Hadoop - Jayant Shekhar 00:45:26
    73. Design, Transparency, and Big Data in Civil Litigation - Dean Malmgren and Michael Stringer 00:27:47
    74. Big Data from Small Devices: Using Smartphones to Understand Human Behavior - Nadav Aharony 00:42:05
    75. Big Data on the Open Cloud - Natasha Gajic 00:33:50
    76. Five Real World Hadoop Success Stories with HP - Sanjai Marimadaiah, Luis Maldonado, and Jerome Levadoux 00:37:21
    77. Druid: Interactive Queries Meet Real-time Data - Eric Tschetter and Danny Yuan 00:44:10
    78. Great Debate: Design Matters More Than Math - Alexander Gray, Monica Rogati, Julie Steele, and Douglas van der Molen 00:46:15