Strata Conference New York + Hadoop World 2012: Complete Video Compilation

Video Description

Explore the changes brought to technology and business by big data, data science, and pervasive computing with this complete video compilation of workshops and sessions from Strata Conference New York and Hadoop World 2012. With well over 100 hours of content, this video package includes the latest information on the skills, tools, and technologies you need to make data work—and build a data-driven business.

Table of Contents

  1. Thinking Big Together: Driving the Future of Data Science - Annika Jimenez and Anthony Goldbloom 00:09:34
  2. The End of the Data Warehouse - Ben Werther 00:05:19
  3. Beyond Batch - Doug Cutting 00:10:34
  4. Finance vs. Machine Learning - Cathy O'Neil 00:33:33
  5. From Traditional Database to Big Data Platform - Irfan Khan 00:06:16
  6. Moneyball for New York City - Michael Flowers 00:09:34
  7. The Democratization of Big Data: Bringing Hadoop to the Masses - James Markarian 00:06:00
  8. Hadoop: Thinking Big - John Schroeder 00:11:13
  9. Big Answers - Mike Olson 00:13:16
  10. The Composite Database - Rich Hickey 00:07:54
  11. The Human Face of Big Data - Rick Smolan 00:11:53
  12. Are We Really Winning the Information Revolution? - Samantha Ravich 00:14:38
  13. Big Data Direct: The Era of Self-driven Big Data Exploration - Sharmila Shahani-Mulligan 00:13:41
  14. Bringing the 'So What' to Big Data - Tim Estes 00:15:52
  15. A Hands-on Introduction to Cross-disciplinary Analytics With Python - Part 1 - Roy Hyunjin Han 00:34:58
  16. A Hands-on Introduction to Cross-disciplinary Analytics With Python - Part 2 - Roy Hyunjin Han 00:48:25
  17. A Hands-on Introduction to Cross-disciplinary Analytics With Python - Part 3 - Roy Hyunjin Han 00:45:58
  18. A Hands-on Introduction to Cross-disciplinary Analytics With Python - Part 4 - Roy Hyunjin Han 00:36:22
  19. An Introduction to Hadoop - Part 1 - Mark Fei 00:46:55
  20. An Introduction to Hadoop - Part 2 - Mark Fei 00:40:37
  21. An Introduction to Hadoop - Part 3 - Mark Fei 00:38:29
  22. An Introduction to Hadoop - Part 4 - Mark Fei 00:50:41
  23. Testing Hadoop Applications - Part 1 - Tom Wheeler 00:37:01
  24. Testing Hadoop Applications - Part 2 - Tom Wheeler 00:53:32
  25. Testing Hadoop Applications - Part 3 - Tom Wheeler 00:47:22
  26. Testing Hadoop Applications - Part 4 - Tom Wheeler 00:42:52
  27. Using HBase - Part 1 - Amandeep Khurana and Matteo Bertozzi 01:00:25
  28. Using HBase - Part 2 - Amandeep Khurana and Matteo Bertozzi 00:58:35
  29. Eating at the Trough of Disillusionment - Alistair Croll 00:14:47
  30. What Do We Need to Teach Our Organizations About Big Data? - Robert 00:24:23
  31. Analytics for the Real World - Marshall Sponder 00:21:25
  32. Case Study: Big Data, Small(er) Company - Camille Fournier 00:14:34
  33. Every Visualization You've Seen is Worthless - Noah Iliinsky 00:15:34
  34. What Can Enterprises Learn from Startups? - Björn Herrmann 00:15:44
  35. Case Study: Changing the Culture of the Music Industry - David Boyle 00:14:54
  36. Case Study: Augmenting Humans to Make Better Policy Decisions - Sean Gourley 00:15:39
  37. Case Study: What's a Customer Worth? - Roberto Medri 00:16:04
  38. The Disappearing Interface: Case Studies in Augmented Humanity - JD Vogt 00:15:40
  39. Stuck in the Eighties: Why Marketers Still Don't Get Big Data - Tom Phillips 00:15:47
  40. Case Study: Data-Driven Door-to-Door Sales - Dirk Van den Poel and Dauwe Vercamer 00:17:38
  41. What Business People Need to Know About Data Governance - Micheline Casey 00:15:02
  42. How Much Privacy Can We Really Expect? - Mary Ludloff and Terence Craig 00:17:09
  43. Given Enough Monkeys - Some Thoughts on Randomness - Jesse Anderson 00:14:13
  44. Mainstream Big Data Through Storytelling - Kristian Hammond 00:12:59
  45. Linking Census and Enterprise Data Sets - Deborah Cooper 00:13:19
  46. Dealing with Dirty Data - Finding the Right Tool for the Job - Part 1 - Susan E. McGregor, Alice Brennan, and Michael Sullivan 00:46:09
  47. Dealing with Dirty Data - Finding the Right Tool for the Job - Part 2 - Susan E. McGregor, Alice Brennan, and Michael Sullivan 00:44:27
  48. Dealing with Dirty Data - Finding the Right Tool for the Job - Part 3 - Susan E. McGregor, Alice Brennan, and Michael Sullivan 01:03:17
  49. Building a Large-scale Data Collection System Using Flume NG - Part 1 - Hari Shreedharan, Will McQueen, Arvind Prabhakar, Prasad 01:00:38
  50. Building a Large-scale Data Collection System Using Flume NG - Part 2 - Hari Shreedharan, Will McQueen, Arvind Prabhakar, Prasad 01:07:03
  51. Building a Large-scale Data Collection System Using Flume NG - Part 3 - Hari Shreedharan, Will McQueen, Arvind Prabhakar, Prasad 00:45:07
  52. A Most Excellent Big Data Strategy - Bill Schmarzo 00:20:28
  53. Big Data Is Not Yet Another IT Project - Krish Krishnan 00:42:36
  54. Moving to Big Data: Strategies and Tactics for Setting Your Organization up for Success - Sheridan Hitchens 00:44:28
  55. Data Wrangling: Making People Productive with Data - Joe Hellerstein 00:43:09
  56. How To Plan a Successful Big Data Pilot - Michael Gold and Ryan McClarren 00:48:29
  57. Hadoop's Role in a Big Data Architecture - Jim Walker 00:47:32
  58. Not Just Hadoop: NoSQL in the Enterprise - Steve Francia 00:45:09
  59. Designing Data Visualizations Workshop - Part 1 - Noah Iliinsky 00:33:40
  60. Designing Data Visualizations Workshop - Part 2 - Noah Iliinsky 00:33:24
  61. Designing Data Visualizations Workshop - Part 3 - Noah Iliinsky 00:38:25
  62. Designing Data Visualizations Workshop - Part 4 - Noah Iliinsky 00:29:42
  63. Search and Real-time Analytics on Big Data - Part 1 - Sewook Wee, Ryan Tabora, and Jason Rutherglen 00:42:49
  64. Search and Real-time Analytics on Big Data - Part 2 - Sewook Wee, Ryan Tabora, and Jason Rutherglen 00:44:25
  65. Search and Real-time Analytics on Big Data - Part 3 - Sewook Wee, Ryan Tabora, and Jason Rutherglen 00:35:07
  66. Search and Real-time Analytics on Big Data - Part 4 - Sewook Wee, Ryan Tabora, and Jason Rutherglen 00:30:39
  67. Hadoop Data Warehousing with Hive - Part 1 - Dean Wampler 00:45:58
  68. Hadoop Data Warehousing with Hive - Part 2 - Dean Wampler 00:41:44
  69. Hadoop Data Warehousing with Hive - Part 3 - Dean Wampler 00:38:50
  70. Hadoop Data Warehousing with Hive - Part 4 - Dean Wampler 00:40:37
  71. Best Practices for Building and Deploying Predictive Models over Big Data - Part 1 - Robert Grossman and Collin Bennett 00:54:03
  72. Best Practices for Building and Deploying Predictive Models over Big Data - Part 2 - Robert Grossman and Collin Bennett 00:31:19
  73. Best Practices for Building and Deploying Predictive Models over Big Data - Part 3 - Robert Grossman and Collin Bennett 00:35:58
  74. Best Practices for Building and Deploying Predictive Models over Big Data - Part 4 - Robert Grossman and Collin Bennett 00:47:05
  75. 'Data Exponential' - K-12 Learning Analytics for Personalized Learning at Scale: Opportunities and Challenges - Roy Pea, Stephen 00:46:36
  76. Analyzing Millions of GitHub Commits: What Makes Developers Happy, Angry, and Everything in Between? - Ilya Grigorik and Brian D 00:46:21
  77. Best Practices for Publishing Data - Hjalmar Gislason 00:36:36
  78. Best Practices for Reproducible Research: A Case Study in Quantitative Finance - Chang She 00:38:01
  79. Beyond Hadoop: Fast Ad-Hoc Queries on Big Data - Mike Driscoll and Eric Tschetter 00:36:27
  80. Beyond Targeted Ads: Big Data for a Better World - Robert Kirkpatrick 00:35:57
  81. Big Data Analytics Platform at Nokia: Selecting the Right Tool for the Right Workload - Yekesa Kosuru and Jim Tommaney 00:42:11
  82. Big Data for the Masses: How We Opened Up the Doors to Google's Dremel - Michael Manoochehri and Jim Caputo 00:38:40
  83. Big Data is a Hotbed of Thoughtcrime. So What? - Jim Adler 00:37:19
  84. Big Data: Turning the Information Overload into an Information Advantage - Chris Selland and Jerome Levadoux 00:36:24
  85. Big Data Wonderland: Two Views on the Big Data Revolution - Mark Madsen and Marc Demarest 00:54:54
  86. BizData Monetization: Turn Your Data into Dollars - Thomas Strachan 00:19:03
  87. Breeding Data Scientists - Amy O'Connor and Danielle Dean 00:45:14
  88. Building Rich, High Performance Tools for Practical Data Analysis - Wes McKinney 00:40:57
  89. Building the Next Platform for Analytic Apps in the Cloud - George Mathew 00:39:04
  90. Commercial Graph: A Map of Financial Relationships - Michael Radwin 00:38:57
  91. Creative Thinking and Data Science - Michael Stringer 00:27:15
  92. Continuous Experimentation with Continuous Deployment - Steve Mardenfeld 00:43:29
  93. Data Analysis for Explorers - Jesper Andersen 00:38:52
  94. Data Science on Hadoop: How Cloudera Impala Unlocks New Productivity and Insights - Justin Erickson and Marcel Kornacker 00:32:21
  95. Data Science with Hadoop at Opower - Erik Shilts 00:42:11
  96. Deconstructing the Database - Rich Hickey 00:43:44
  97. Demonstrating The Future of Data Science - Mike Maxey 00:26:38
  98. Deploy a Highly Available, Elastic, Multi-tenant Hadoop Cluster in 10 Minutes - Richard McDougall 00:48:57
  99. Designing Hadoop for the Enterprise Data Center - Jacob Rapp and Eric Sammer 00:39:17
  100. Designing for Data-driven Organizations - Bitsy Bentley 00:24:22
  101. Drive Smarter Decisions with Microsoft Big Data - Shawn Bice 00:43:18
  102. Explore/Exploit: Driving Business Value with Big Data - Raymie Stata 00:46:35
  103. GraphBuilder: Scalable Graph Construction using Hadoop - Nilesh Jain 00:37:02
  104. HDFS - What is New and Future - Sanjay Radia and Todd Lipcon 00:47:07
  105. Hadoop as a Complementary Data Platform at PayPal - Moises Nascimento and Nagaraju Chayapathi 00:44:29
  106. Hadoop Analytics Without a Ph.D - Richard Daley 00:24:57
  107. Helping the World's Farmers Adapt to Climate Change - Siraj Khaliq 00:47:23
  108. hGraph: An Open System for Visualizing Personal Health Metrics - Juhan Sonin 00:32:43
  109. High Availability for the HDFS NameNode: Phase 2 - Aaron Myers and Todd Lipcon 00:40:46
  110. How Draw Something Absorbed 50 Million New Users, in 50 Days, With Zero App Downtime - Frank Weigel 00:39:07
  111. How to See Data - Kim Rees 00:46:02
  112. How a Traditional Media Company Embraced Big Data - Oscar Padilla, Franklin Rios, and Vineet Tyagi 00:46:17
  113. Is Your Cluster a Leaning Tower of Pisa? - Michael Segel 00:43:36
  114. Knitting Boar - Josh Patterson and Michael Katzenellenbogen 00:42:02
  115. Large Scale ETL with Hadoop - Eric Sammer 00:40:22
  116. Letting More Developers Dance with Elephants: What We Learned - Matt Winkler 00:46:01
  117. MapReduce Design Patterns - Donald Miner 00:38:08
  118. Maximizing ROI by Sharing your Hadoop Big Data Center - Rohit Valia 00:35:42
  119. Making Major League Data Work: Carving Up Big Data into Useful Applications for Specific Audiences - Richard Brath and Noah Schw 00:41:20
  120. Making Pig Fly: Optimizing Data Processing on Hadoop - Thejas Madhavan Nair and Jianyong Dai 00:44:10
  121. Moneyballing Criminal Justice: Using Data to Reduce Crime - Anne Milgram 00:47:26
  122. Monitoring Cloud Data - Gary Dusbabek 00:29:41
  123. Netflix's Evolving Data Science Architecture - Kurt Brown 00:39:36
  124. Of Rocket Ships and Washing Machines: Data Technology for People - Joe Hellerstein 00:10:45
  125. Performing Data Science with HBase - Aaron Kimball and Kiyan Ahmadizadeh 00:36:57
  126. Predictive Modeling and Operational Analytics over Streaming Data - Roger Barga 00:36:34
  127. Real-time Big Data Without Streaming - Ron Bodkin 00:39:46
  128. Real-time Learning with Bayesian Bandits - Ted Dunning 00:41:25
  129. Realtime Processing with Storm - Gabriel Eisbruch, Luis Darío Simonassi, and Jonathan Leibiusky 00:42:46
  130. Scala + Cascading = Scalding - Avi Bryant 00:42:47
  131. Scalable, Accessible, Predictive Analytics on Hadoop - Steven Hillion 00:42:48
  132. Searching for the Genetic Causes of Disease with Hadoop - Charles Schmitt 00:40:54
  133. Simple, Flexible Distributed Computing in Julia - Stefan Karpinski and Jeff Bezanson 00:46:08
  134. Start Small Before Going Big - Steve Yun and Joseph Rickert 00:38:07
  135. Storytelling with Data - Romy Misra 00:23:27
  136. Taming the Object Graph - Justin Moore 00:33:58
  137. The Art of Analytical Decomposition - Claudia Perlich 00:34:21
  138. The Death of the Enterprise Data Warehouse - Paul Groom 00:42:09
  139. The Language of Discovery: A Toolkit for Designing Big Data Interfaces and Interactions - Joe Lamantia 00:48:08
  140. They Don't Teach You That In School - Cathy O'Neil and Julie Steele 00:09:31
  141. This Message Will Self Destruct: The Implications of Self-Destructing Digital Data - Susan E. McGregor and Kathleen Duff 00:40:56
  142. Top 10 Things We Learned About Hadoop (since we started focusing on it) - Val Bercovici 00:37:02
  143. Trecul: Data Flow Processing Using LLVM-based JIT Compilation on Top of Hadoop - David Blair 00:43:27
  144. Turning Raw Data in Hadoop into Interactive BI (Capital One Labs Case Study) - Peter Schlampp 00:43:46
  145. Tying the Knot Between Hadoop and EDW - David Jonker 00:43:25
  146. Ubiquity, Interfaces, and Data: A Look Ahead to the Internet of Things - Rob Coneybeer 00:19:35
  147. UGD (User Generated Data), Product Development, and Privacy - Adrian Woodhead 00:47:19
  148. Using Data to Tune A Software Team - Jonathan Alexander 00:53:53
  149. Using Hadoop to do Agile Iterative ETL - Ben Werther and Kevin Beyer 00:43:07
  150. Visualizing Networks - Lynn Cherny 00:46:20
  151. Visualization: An Emerging Collaboration Opportunity - Lee Feinberg 00:27:17
  152. Web Data Visualization: What's Becoming Easy, What's Becoming Possible - Kevin Lynagh, Kim Rees, Hadley Wickham, and David Nolen 00:32:56
  153. What Can We Learn from Billions of Foursquare Check-ins? - Blake Shaw 00:36:37
  154. Zillow: Disrupting the Real Estate Marketplace with Data - Stan Humphries 00:50:24