The alpha phase brings data from the Data Lake into an application cluster.

Depending on the source, we have to use the right GCP service. For streaming data, we can also use tools from the Hadoop ecosystem: Apache Flume, Apache Storm, Apache Kafka, or Apache Beam. Interestingly, Apache Beam can also serve as an alternative to Cloud Dataflow: Dataflow itself executes Beam pipelines, and the same pipeline can run on other runners such as Apache Flink or Apache Spark.

To ingest data from Cloud SQL, we can use Sqoop, the Hadoop component for bulk transfers between relational databases and HDFS. For batch loads we have plenty of options, and Apache Flume can be one of them as well.
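
As a rough illustration of the Sqoop route, the sketch below assembles a `sqoop import` command line for a Cloud SQL (MySQL) table. It is only a sketch under stated assumptions: the helper name `build_sqoop_import`, the JDBC URL, the user, the password file path, the table, and the target directory are all hypothetical placeholders, and the cluster is assumed to have Sqoop and a MySQL JDBC driver installed and to be able to reach the Cloud SQL instance. The command is only built and printed here, not executed.

```python
import shlex
from typing import List


def build_sqoop_import(jdbc_url: str, username: str, password_file: str,
                       table: str, target_dir: str,
                       num_mappers: int = 4) -> List[str]:
    """Return the argv list for a `sqoop import` of one table into HDFS.

    All values are supplied by the caller; nothing here is specific to
    a particular project or instance.
    """
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    return [
        "sqoop", "import",
        "--connect", jdbc_url,             # JDBC URL of the source database
        "--username", username,
        "--password-file", password_file,  # file holding the password
        "--table", table,                  # source table to copy
        "--target-dir", target_dir,        # HDFS directory for the output
        "--num-mappers", str(num_mappers), # number of parallel map tasks
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    argv = build_sqoop_import(
        jdbc_url="jdbc:mysql://10.0.0.5:3306/sales",
        username="etl_user",
        password_file="/user/etl/.sqoop-password",
        table="orders",
        target_dir="/data/raw/orders",
    )
    # Print a shell-safe version of the command for review.
    print(shlex.join(argv))
```

Returning the arguments as a list rather than a single string keeps quoting out of the picture if the command is later handed to `subprocess.run`; `shlex.join` is used only to display it.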
