Strata + Hadoop World Conference in Barcelona 2014: Complete Video Compilation

Video Description

Immerse yourself in the world of data

Unable to attend Strata + Hadoop World Conference in Barcelona 2014? This complete video compilation will get you up to speed on every keynote, workshop, session, and lightning demo held at the conference. You’ll explore solutions to your most challenging problems, find out what’s new in emerging technologies and Apache Hadoop, and see for yourself what data can do.

Download these videos or stream them through our HD player, and view presentations from scores of experienced data practitioners from finance, media, government, and education. You’ll gain a clear perspective on the future of big data, including all the analytics, architectures, techniques, tools, and technologies you need to use data successfully.

Topics include:

  • Business & Industry: How organizations of all sizes use data to make better decisions
  • Data Science: Everything from the latest algorithms and advances in machine learning to cultural change and team-building
  • Design: Capturing user experience, design, new interfaces, and visualization
  • Hadoop Platform: A deep dive into the dominant big data stack, with practical lessons and integration tricks
  • Internet of Things: Extracting meaningful insights from data collected and generated by things
  • Privacy, Law & Ethics: Issues on governance, ethics, and compliance in the era of open data
  • Tools & Technology: How tools like Cassandra, Storm, Accumulo, Kafka, and Spark fit in the data science toolkit

Table of Contents

  1. Keynotes
    1. Open Data Center of the Future - Mike Olson (Cloudera) 00:14:41
    2. Data Driven Design at F1 Speed - Geoff McGrath (McLaren Applied Technologies) 00:15:43
    3. Big Data 3.0 - Rod Smith (IBM Emerging Internet Technologies) 00:10:14
    4. Hiding Information Inside Big Data, and the Hypocrisy of Privacy - Alicia Asin (Libelium) 00:10:02
    5. Data and Product and Tech, Oh My! - Camille Fournier (Rent the Runway) 00:13:35
    6. Mission Critical Big Data - David Richards (WANdisco, Inc.) 00:05:02
    7. #IoTH: The Internet of Things and Humans - Tim O'Reilly (O'Reilly Media, Inc.) 00:21:08
    8. Understanding Decisions Driven by Big Data: From Analytics Management to Privacy-friendly Cloaking Devices - Foster Provost (NYU | Stern) 00:15:07
    9. What is a Data Lake, Anyway? - Martin Willcox (Teradata) 00:15:12
    10. Yes, Open Data Has Value! - Majken Sander (T. Hansen Gruppen A/S) 00:15:32
    11. Predictive Analytics in the Cloud: Predicting Football - Jordan Tigani (Google) 00:13:55
    12. Embracing the Human Element - Rodney Mullen (Almost Skateboards) 00:11:21
    13. What's the Big Deal About City Data? - Francine Bennett (Mastodon C) 00:09:17
    14. Storytelling and Science - Ben Okri (Self) 00:14:26
  2. Business & Industry sessions
    1. High Level Abstractions Make Big Data Useful for Real People - Melissa Santos (Etsy) 00:37:53
    2. Old Dogs, New Tricks: How Data-driven Intrapreneurs Make Big Companies Innovate - Alistair Croll (Solve For Interesting) 00:35:42
    3. Beyond Clicks: Analyzing User Engagement with Content Discovery Platform - Roy Sasson (Outbrain) 00:35:02
    4. Understanding your Unicorns: Data Science Team Building in Action - Kim Nilsson (pivigo academy) 00:44:55
    5. Case study: The Benefits and Challenges of Running in the Cloud - Marton Trencseni (Prezi) 00:38:33
    6. Telling Meaningful Stories With Data - Daniel Waisberg (Google) 00:26:25
    7. The Unlikely Match between Financial Data and Open Innovation - Marcelo Soria-Rodriguez (BBVA Data & Analytics) 00:41:46
    8. Welcoming Machine Learning into the Heart of a Creative Business - David Boyle (BBC Worldwide), Amanda Hill (BBC Worldwide), and Dan Jabry (CrowdEmotion) 00:39:41
    9. Data Governance for Regulated Industries - Amir Halfon (ScalingData) 00:28:11
    10. Model Workers: How leading companies are securing and creating value from their data talent - Juan Mateos Garcia (Nesta) 00:43:09
    11. How to Datafy Your Business - Carme Artigas (Synergic Partners) 00:39:38
  3. Hadoop & Beyond sessions
    1. Architectural Considerations for Hadoop Applications (Using Clickstream Analytics as an Example) - Part 1 01:19:10
    2. Architectural Considerations for Hadoop Applications (Using Clickstream Analytics as an Example) - Part 2 01:27:23
    3. SAMOA: A Platform for Mining Big Data Streams - Gianmarco De Francisci Morales (Yahoo Labs) 00:41:12
    4. Building an Intelligent Big Data App in 30 minutes - Claudiu Barbura (Atigeo) and David Talby (Atigeo, LLC) 00:43:30
    5. Spark Streaming Case Studies - Paco Nathan (Databricks) 00:42:35
    6. Identifying Outliers at Scale Using Real-time Search Engines - Costin Leau (Elasticsearch) 01:06:04
    7. Yarns about YARN: Migrating to MapReduce v2 - Kathleen Ting (Cloudera) 00:35:22
    8. HBase for Architects - Nick Dimiduk (Hortonworks, Inc) 00:42:44
    9. From Pilot to Production: Building a Data Infrastructure - John Akred (Silicon Valley Data Science) 00:44:00
    10. Resistance is Futile: The Next Generation Big Data Architecture - Jim Scott (MapR) 00:46:10
  4. Data Science sessions
    1. Exploratory Data Analysis with Apache Spark - Hossein Falaki (Databricks Inc.) 00:42:03
    2. A Gentle Introduction to Apache Spark and Clustering for Anomaly Detection - Sean Owen (Cloudera) 00:35:06
    3. Data Science Toolbox and the Importance of Reproducible Research - Jeroen Janssens (Elsevier) 00:39:42
    4. Building a Unified Data Pipeline in Spark - Aaron Davidson (Databricks) 00:39:22
    5. Search Query Categorization at Scale - Alex Dorman (Magnetic) and Michal Laclavik (Magnetic) 00:29:48
    6. Linking Data Without Common Identifiers - Lars Marius Garshol (Bouvet) 00:43:45
    7. Realtime Data Analysis Patterns - Mikio Braun (streamdrill) 00:39:24
    8. Get Productive with Predictive Applications. Unleash Your Inner Data Scientist - Shawn Scully (Graphlab) 00:49:15
    9. Automating Machine Learning Systems: Lessons Learned - Ofer Ron (LivePerson) 00:38:20
    10. Doing the Impossible, Almost (A survey of approximation algorithms that make queries vastly faster) - Ted Dunning (MapR Technologies) 00:46:01
  5. Design sessions
    1. Making Data Human - Jesús Gorriti (Fjord) 00:36:45
    2. Data and Design: We're All Invited on the Data Journey - Juliette Melton (New York Times) 00:43:10
    3. The Data Future - Kim Rees (Periscopic) 00:41:06
    4. Challenges in Developing Contextual Applications - Hakan Jonsson (Sony Mobile Communications) 00:36:36
    5. From Confusing to Convincing: A Framework for Using Animation and Storytelling to Bolster the Effectiveness of Interactive Visualizations - Michael Freeman (Institute for Health Metrics and Evaluation, University of Washington) 00:39:39
  6. Government/Open Data sessions
    1. Using Data for GOOD - Francine Bennett (Mastodon C) and Duncan Ross (Teradata) 00:43:47
    2. Crunching Common Crawl with the Cloud-Based MIA Platform - Lisa Green (Common Crawl) and Peter Adolphs (Neofonie) 00:42:29
    3. Happy City: Shortest Urban Paths or Shortcuts to Happiness? - Daniele Quercia (Yahoo Labs) 00:33:52
    4. Making the Work of Fire Fighters Safer with Information Awareness - Bart van Leeuwen (netage.nl) 00:38:49
  7. Hadoop Platform sessions
    1. Beyond the Hammer: Using Multiple Tools to Simplify Big Data Solutions - Guy Ernest (Amazon Web Services) 00:36:32
    2. A Survey of HBase Application Archetypes - Lars George (Cloudera) and Jonathan Hsieh (Cloudera, Inc) 00:39:20
    3. Petascale Genomics - Uri Laserson (Cloudera) 00:43:22
    4. Moving Towards a Streaming Architecture - Garry Turkington (Improve Digital) and Gabriele Modena (Improve Digital) 00:39:53
    5. Driving Personalization with Real Time Big Data Analytics - Ameya Kantikar (Groupon) 00:41:02
    6. From Raw Data to Analytics with No ETL - Marcel Kornacker (Cloudera, Inc.) 00:27:15
    7. Enterprise Hadoop Architecture – Lessons from Cisco’s Hadoop Journey - Floris Grandvarlet (Cisco) 00:44:29
    8. From BI to Big Data at Solocal - Abed Ajraou (Solocal) 00:40:04
    9. Hadoop and Pediatric Healthcare: Bedside Vitals and Better Babies - Tod Davis (Children's Healthcare of Atlanta) 00:41:24
    10. RT-Giraph: Online Graph Mining Simplified - Georgos Siganos (Qatar Computing Research Institute) 00:42:07
  8. Internet of Things sessions
    1. Intel's Cloud Wearable & IoT Analytics Platform - Assaf Araki (Intel) 00:32:43
    2. Will the Hordes of IoT Data Bring the Post-Hadoop Era and Democratize Data Stores? - Jodok Batlogg (CRATE Technology GmbH) 00:29:22
  9. Privacy, Law & Ethics sessions
    1. Behavioral Analytics with Smartphone Data - Joerg Blumtritt (Datarella) 00:39:09
    2. Forecasting Space-time Events - Jeremy Heffner (Azavea) 00:42:56
    3. Unraveling Myths of Digital Privacy & Advertising - Joshua Koran (Turn) 00:37:30
    4. A Framework of Purpose and Consent for Data Security and Consumer Privacy - Aurelie Pols (Mind Your Group) 00:35:32
  10. Data-Driven Business Day
    1. Data-Driven Business Day Introduction - Simon Wardley 00:28:02
    2. It Ain’t What You Do To Data, It’s What You Do With It - Edd Dumbill (Silicon Valley Data Science) 00:32:40
    3. The Secret Lives of Cities - Francine Bennett (Mastodon C) 00:26:16
    4. Big Data Will Destroy More Jobs than It Creates - Roger Magoulas (O'Reilly Media) and Robert "r0ml" Lefkowitz (Sharewave) 00:35:48
    5. How to Train a Dragon - Sharon Biggar (Social Point) 00:24:40
    6. Big Data at King - Vince Darley (King Digital Entertainment) 00:29:58
    7. Moving from the Tactical to the Strategic - Amy Heineike (Quid) 00:21:35
    8. How Big Data is Changing User Engagement - Mounia Lalmas (Yahoo Labs) 00:30:01
    9. Digging into Predictive Analytics with Fine-grained Behavior Data - Foster Provost (Stern) 00:23:06
    10. APIs and Unlocking the Value of Your Data - Steven Willmott (3scale networks) 00:17:54
    11. Big Data Architecture: Build vs Buy Dilemma - Carme Artigas (Synergic Partners) 00:22:33
    12. Life: A Data Story - Marcelo Soria-Rodriguez (BBVA Data & Analytics) 00:18:44
  11. Spark Camp
    1. Spark Camp - Part 1 01:12:37
    2. Spark Camp - Part 2 01:20:24
    3. Spark Camp - Part 3 01:47:22
    4. Spark Camp - Part 4 00:59:22
  12. Sponsored sessions
    1. Analytics 3.0 - Rod Smith (IBM Emerging Internet Technologies) 00:33:55
    2. The Internet of Trains - Frank Saeuberlich (Teradata) 00:51:14
    3. HDFS for Geographically Distributed File System - Konstantin Shvachko (WANdisco) 00:46:57
    4. Practical Big Data Blending with Pentaho Data Integration. Blend Your Data for Deeper Insights - Matt Casters (Pentaho) 00:31:05
    5. Configuring a Secure, Multi-Tenant Hadoop Cluster For The Enterprise - James Kinley (Cloudera) 00:48:05
    6. Why Enterprise IT Management Tools are Essential for Big Data Success - Joe Goldberg (BMC Software Inc.) 00:36:52
    7. For Red Hat, it's 1994 all over again - Greg Kleiman (Red Hat) 00:29:34
    8. Splunk at UniCredit: Our Big Data Journey from Daily Troubleshooting to Business Analytics - Marcello Bianchetti (UniCredit SPA) 00:45:48
    9. The ART of Data Governance and Security - Bob Middleton (Tableau Software) 00:37:51
    10. Integrating Big Data into a Programming Language - Tomas Petricek (University of Cambridge) 00:41:11