Strata Data Conference 2017 - Singapore

Video Description

Strata Singapore 2017 offered a powerful over-the-horizon look at the future of business and technology, where it's going and how to get there first, in a series of presentations by big data experts based in Singapore, China, India, and around the world. In this video compilation, you'll gain complete access to each of the keynotes, tutorials, and technical sessions delivered at this must-see event, the largest data conference in the world.

This compilation includes all three of Strata Singapore's highly curated session collections, each designed for a specific purpose: the Strata Business Summit, the Data Science and Machine Learning collection, and the Data Engineering & Architecture collection. The Business Summit is a set of sessions tailored for executives, business leaders, and strategists, where you'll learn how some of the world's leading companies build modern data strategies. Featured speakers include Teresa Tung (Accenture Labs), Grace Tang (Uber), John Akred (Silicon Valley Data Science), and Carme Artigas (Synergic Partners).

The Data Engineering & Architecture collection offers sessions on the pitfalls, tools, technologies, and design considerations involved in building robust data pipelines. Featured speakers in this group include Ted Malaska (Blizzard Entertainment, maker of World of Warcraft), Jonathan Seidman (Cloudera), and Jared Lander (Lander Analytics). The Data Science and Machine Learning collection provides guidance on the techniques you can use to discover the hidden insights within your data. Kira Radinsky (eBay), Paco Nathan (O'Reilly Media), and Danielle Dean (Microsoft) are just three of the featured speakers in this session collection.

With more than 60 hours of material to view at your own pace, this video compilation of Strata Singapore 2017 is a remarkable value for any data scientist, analyst, or business executive who wants to tap into big data's best technologies and opportunities.

  • Enjoy a complete recording of Strata Singapore 2017's keynotes, tutorials, and technical sessions
  • Gain entry to the exclusive Strata Business Summit: the missing MBA for data-driven business
  • Get inspired by keynotes from Tony Lee (JD.com), Ajey Gore (GO-JEK), and Pascale Fung (HKUST)
  • Hear Rhea Liu (China Tech Insights/Tencent) describe AI-driven internet trends in China
  • Join John Akred's (Silicon Valley Data Science) tutorial on how to develop a modern enterprise data strategy
  • Listen as Wolff Dobson (Google) shares the latest TensorFlow news, direct from the Google Brain team
  • Gain insight on becoming data-centric from companies such as SK Telecom, GE, Dotz, and NTT
  • Explore sessions devoted to neural networks, text mining, recommenders, real-time analytics, and business forecasting
  • Take in multiple deep-dive sessions on Spark Streaming, Kafka, Kudu, Presto, Alluxio, Beam, BigDL, and more

Table of Contents

  1. Keynotes
    1. Computational challenges and opportunities of astronomical big data - Melanie Johnston-Hollitt (Victoria University of Wellington) 00:13:56
    2. Siri: The journey to consolidation - Mick Hollison (Cloudera), Cesar Delgado (Apple) 00:14:13
    3. Technology for humanity - Steve Leonard (SGInnovate) 00:17:06
    4. Responsible deployment of machine learning - Ben Lorica (O'Reilly Media) 00:09:30
    5. Stop the fights; embrace data (sponsored by Google) - Felipe Hoffa (Google) 00:05:36
    6. Industrial machine learning - Joshua Bloom (GE Digital) 00:18:35
    7. JD.com security intelligence and analytics: From big data to big impact - Tony Lee (JD.com) 00:17:48
    8. Sentiment and emotion-aware natural language processing - Pascale Fung (The Hong Kong University of Science and Technology) 00:17:45
    9. Freedom or safety? Giving up rights to make our roads and cities safer and smarter - Bruno Fernandez-Ruiz (Nexar) 00:18:53
    10. Impacting a Nation - Ajey Gore (GO-JEK) 00:21:21
    11. The sixth wave: Automation of decisions - Amr Awadallah (Cloudera) 00:10:08
    12. From smart cities to intelligent societies - Carme Artigas (Synergic Partners) 00:11:54
    13. Mining electronic health records and the web for drug repurposing - Kira Radinsky (eBay | Technion) 00:18:57
  2. Strata Business Summit
    1. Executive briefing: Analytics centers of excellence as a way to accelerate big data adoption by business - Carme Artigas (Synergic Partners) 00:42:10
    2. Executive Briefing: Artificial intelligence—The next digital frontier? - Shilpa Aggarwal (McKinsey & Company) 00:42:42
  3. Sponsored
    1. Painless real-time scalable serverless data pipelines: What Google Cloud can do for you (sponsored by Google Cloud) - Felipe Hoffa (Google) 00:41:48
    2. Delivering a big data analytics API with 360-degree customer profile data from multiple industry data sources (sponsored by Kinetica) - Vira Shanty (Lippo Group) 00:45:07
  4. Machine Learning
    1. Engineering cloud-native machine learning applications - Harjindersingh Mistry (Ola), Bargava Subramanian (Independent) 00:41:16
    2. The trials of machine learning at Zendesk - Wai Yau (Zendesk), Jeffrey Theobald (Zendesk) 00:31:58
    3. Extending Spark ML: Adding custom pipeline stages to Spark - Holden Karau (Google) 00:29:09
    4. Apache Spark ML and MLlib tuning and optimization: A case study on boosting the performance of ALS by 60x - Peng Meng (Intel) 00:27:46
    5. Energy monitoring with a self-taught deep network - Yiqun Hu (Singapore Power) 00:38:10
    6. Payment fraud detection and prevention in the age of big data, network science, and AI - Markus Kirchberg (Wismut Labs Pte. Ltd.) 00:46:14
  5. Data engineering and architecture
    1. Data production pipelines: Legacy, practices, and innovation - Natalino Busa (DBS) 00:44:22
    2. Unsupervised fuzzy labeling using deep learning to improve anomaly detection - Adam Gibson (Skymind) 00:41:31
    3. TigerGraph: A complete high-performance graph data and analytics platform - Mingxi Wu (TigerGraph), Yu Xu (TigerGraph) 00:45:13
    4. Distributed real-time highly available stream processing - Yu-Xi Lim (Teralytics), Michal Wegrzyn (Teralytics) 00:32:45
    5. LINE's log analysis platform - Wataru Yukawa (LINE) 00:36:52
    6. Apache Kylin: Advanced tuning and best practices with KyBot - Dong Li (Kyligence), Luke Han (Kyligence) 00:40:32
    7. Top five mistakes when writing streaming applications - Ted Malaska (Blizzard Entertainment) 00:40:58
    8. Apache Spark in the hands of data scientists - Neelesh Srinivas Salian (Stitch Fix) 00:42:12
    9. High-performance enterprise data processing with Spark - Vickye Jain (ZS Associates), Raghav Sharma (ZS Associates) 00:42:40
  6. Smart cities and urban automation
    1. Leveraging live data to realize the smart cities vision - Arun Kejariwal (MZ), Francois Orsini (MZ) 00:22:53
    2. Analytics at the core of IoT ecosystems - Carme Artigas (Synergic Partners) 00:57:50
    3. Smart cities, the smart grid, the IoT, and big data - Mark Donsky (Cloudera), Syed Rafice (Cloudera) 00:28:37
  7. Data science and advanced analytics
    1. Bootstrap custom image classification using transfer learning - Danielle Dean (Microsoft), Wee Hyong Tok (Microsoft) 00:37:52
    2. Train, predict, and serve: How to put your machine learning model into production - Aki Ariga (Cloudera) 00:37:59
    3. Managing machine learning models in production - Anand Chitipothu (rorodata) 00:30:06
    4. AI within O'Reilly Media - Paco Nathan (O'Reilly Media) 00:36:02
    5. DevOps for models: How to manage millions of models in production—and at the edge - Teresa Tung (Accenture Labs), Ishmeet Grewal (Accenture Labs), Jurgen Weichenberger (Accenture Analytics) 00:36:01
    6. Fusing a deep learning platform with a big data platform - YongLiang Xu (StarHub), Masatake Iwasaki (NTT DATA Corporation) 00:28:30
    7. Aha moments in deep learning at Zendesk - Chris Hausler (Zendesk), Arwen Griffioen (Zendesk) 00:37:07
    8. Forecasting intermittent demand: Traditional smoothing approaches versus the Croston method - Prateek Nagaria (The Data Team) 00:20:49
    9. Deploying a scalable JupyterHub environment for running Jupyter notebooks - Graham Dumpleton (Red Hat) 00:40:06
  8. Big data and the cloud
    1. Rethinking data marts in the cloud: Common architectural patterns for analytics - Henry Robinson (Cloudera), Greg Rahn (Cloudera) 00:45:40
    2. Training and scoring deep neural networks in the cloud - Wee Hyong Tok (Microsoft), Danielle Dean (Microsoft) 00:40:39
    3. Big data on the rise: Views of emerging trends and predictions from real-life end users - John Mertic (The Linux Foundation), Cupid Chan (4C Decision) 00:41:22
    4. Decoupling compute and storage with open source Alluxio - Calvin Jia (Alluxio), Haoyuan Li (Alluxio) 00:46:27
    5. Architecting a text analytics system in the cloud - Arun Veettil (Skellam AI) 00:32:32
  9. Data case studies
    1. Advanced analytics for a safe city - Clifton Phua (NCS Group) 00:22:04
    2. From physical data collection to digital delivery of results: The data journey in developing economies - Alexandre Chade (Dotz) 00:22:40
    3. Moving a smart nation: Using telco data for public transport - Zhihao Lin (Teralytics) 00:31:24
  10. Design, UX, visualization, and VR
    1. Analyzing smart cities and big data in 3D: A Geo3D journey at SmartHub - Victor Chua (StarHub Ltd) 00:36:48
    2. The art of data storytelling - Isaac Reyes (DataSeer) 00:37:11
    3. Designing AI-based conversational UIs - Mohammed Abdoolcarim (Vahan) 00:38:41
  11. Becoming a data-centric company
    1. Organizing for machine learning success - John Akred (Silicon Valley Data Science), Mark Hunter (Sainsburys Bank) 00:42:31
    2. Executive Briefing: The five dysfunctions of a data engineering team - Jesse Anderson (Big Data Institute) 00:36:09
    3. Executive Briefing: The data-driven growth engine - Jessica Chen Riolfi (TransferWise) 00:33:15
    4. Executive Briefing: How to structure, recruit, operationalize, and maintain your insights organization - Ricky Barron (InfoStrategy) 00:41:20
    5. The value of a data science center of excellence (COE) - Benjamin Wright-Jones (Microsoft), Simon Lidberg (Microsoft) 00:40:26
    6. Executive Briefing: Becoming a data-driven enterprise—A maturity model - Teresa Tung (Accenture Labs) 00:40:11
    7. Turning fails into wins - Grace Tang (Uber) 00:23:55
    8. Enabling data-driven decision making: Challenges of logical and physical scale - Sarang Anajwala (Autodesk) 00:30:55
    9. Managing successful data projects: Technology selection and team building - Jonathan Seidman (Cloudera), Ted Malaska (Blizzard Entertainment) 00:36:39
  12. Tutorials
    1. Developing a modern enterprise data strategy - John Akred (Silicon Valley Data Science) - Part 1 00:41:35
    2. Developing a modern enterprise data strategy - John Akred (Silicon Valley Data Science) - Part 2 00:47:58
    3. Developing a modern enterprise data strategy - John Akred (Silicon Valley Data Science) - Part 3 00:41:44
    4. Developing a modern enterprise data strategy - John Akred (Silicon Valley Data Science) - Part 4 00:49:12
    5. A deep dive into running big data workloads in the cloud - Vinithra Varadharajan (Cloudera), Philip Langdale (Cloudera), Jason Wang (Cloudera), Fahd Siddiqui (Cloudera) - Part 1 00:41:13
    6. A deep dive into running big data workloads in the cloud - Vinithra Varadharajan (Cloudera), Philip Langdale (Cloudera), Jason Wang (Cloudera), Fahd Siddiqui (Cloudera) - Part 2 00:40:08
    7. A deep dive into running big data workloads in the cloud - Vinithra Varadharajan (Cloudera), Philip Langdale (Cloudera), Jason Wang (Cloudera), Fahd Siddiqui (Cloudera) - Part 3 00:32:28
    8. Interactive visualization for data science - Bargava Subramanian (Independent), Amit Kapoor (narrativeVIZ Consulting) - Part 1 00:42:02
    9. Interactive visualization for data science - Bargava Subramanian (Independent), Amit Kapoor (narrativeVIZ Consulting) - Part 2 00:35:20
    10. Interactive visualization for data science - Bargava Subramanian (Independent), Amit Kapoor (narrativeVIZ Consulting) - Part 3 00:39:04
    11. Interactive visualization for data science - Bargava Subramanian (Independent), Amit Kapoor (narrativeVIZ Consulting) - Part 4 00:22:55
    12. Architecting a next-generation data platform - Jonathan Seidman (Cloudera), Ted Malaska (Blizzard Entertainment) - Part 1 00:44:23
    13. Architecting a next-generation data platform - Jonathan Seidman (Cloudera), Ted Malaska (Blizzard Entertainment) - Part 2 00:40:48
    14. Architecting a next-generation data platform - Jonathan Seidman (Cloudera), Ted Malaska (Blizzard Entertainment) - Part 3 00:51:01
    15. Architecting a next-generation data platform - Jonathan Seidman (Cloudera), Ted Malaska (Blizzard Entertainment) - Part 4 00:42:15
    16. Getting started with TensorFlow - Yufeng Guo (Google) - Part 1 00:36:22
    17. Getting started with TensorFlow - Yufeng Guo (Google) - Part 2 00:47:11
    18. Getting started with TensorFlow - Yufeng Guo (Google) - Part 3 00:47:54
    19. Getting started with TensorFlow - Yufeng Guo (Google) - Part 4 00:41:52
    20. Unraveling data with Spark using deep learning and other algorithms from machine learning - Vartika Singh (Cloudera), Jeffrey Shmain (Cloudera) - Part 1 00:50:45
    21. Unraveling data with Spark using deep learning and other algorithms from machine learning - Vartika Singh (Cloudera), Jeffrey Shmain (Cloudera) - Part 2 00:31:40
    22. Unraveling data with Spark using deep learning and other algorithms from machine learning - Vartika Singh (Cloudera), Jeffrey Shmain (Cloudera) - Part 3 00:46:58
    23. Unraveling data with Spark using deep learning and other algorithms from machine learning - Vartika Singh (Cloudera), Jeffrey Shmain (Cloudera) - Part 4 00:38:04
  13. Multiple Topics
    1. Executive Briefing: The business case for AI, Spark, and friends - John Akred (Silicon Valley Data Science) 00:35:41
    2. The stream processor as a database: Building event-driven applications with Apache Flink - Tzu-Li (Gordon) Tai (data Artisans) 00:44:10
    3. Open Budgets India: Lessons from the front line - Gaurav Godhwani (Open Budgets India, Centre for Budget and Governance Accountability) 00:37:49
    4. Privacy by design, not an afterthought: Best practices at LinkedIn - Shirshanka Das (LinkedIn), Tushar Shanbhag (LinkedIn) 00:43:45
    5. Smart agriculture: Blending IoT sensor data with visual analytics on Apache Hive and Spark - Mike Prorock (mesur.io), Hugo Sheng (Qlik) 00:22:46
    6. Streaming analytics at Grab - Andreas Hadimulyono (Grab) 00:41:22
    7. Good everywhere: Managing security and governance in a hybrid- and multicloud world - Nikki Rouda (Cloudera), Kelly Schupp (Zaloni) 00:37:44
    8. Real-world patterns for continuously deployed advanced analytics - Graham Gear (Cloudera) 00:42:56
    9. Big data: The best way to truly understand customers in Telco - Kyungtaak Noh (SK Telecom) 00:28:05
    10. Practical applications for graph techniques in supply chain analysis and finance - Eric Tham (National University of Singapore) 00:45:34
    11. Spark Structured Streaming helps smart manufacturing - Xiaochang Wu (Intel) 00:41:51
    12. From Kafka to BigQuery: A guide for delivering billions of daily events - Ofir Sharony (MyHeritage) 00:37:30
    13. Driving financial inclusion in emerging markets using alternate data and big data analytics - Amit Das (Think Analytics India) 00:40:52
    14. Debugging Apache Spark - Holden Karau (Google), Joey Echeverria (Rocana) 00:45:06
    15. Big telco real-time network analytics - Yousun Jeong (SK Telecom) 00:39:44
    16. Querying time series patterns with SAX - Supreet Oberoi (Oracle) 00:42:25
    17. How to successfully run data pipelines in the cloud - Kostas Sakellis (Cloudera) 00:38:27
    18. Analytics at ING: Technology solutions to create a real-time, data-driven bank - Bas Geerdink (ING) 00:48:24