Data ingest

Ingestion is the process of bringing data into the target system, in this case a big data storage system such as Hadoop, MongoDB, or Cassandra, or any other system that can handle data at that scale efficiently. The data can either arrive in bulk (batch ingestion) or as a continuous stream (event or stream ingestion):

  1. Batch ingestion: Used when you want to move data from one source to another in bulk, for example, when you want to bring your CRM data into Hadoop. Typically, you take a data dump from your relational database and load that dump into your big data storage platform; in other words, you perform a bulk, or batch, ingest of the data (see the first sketch after this list).
  2. Stream ingestion: Used when you have a continuous source of data, such as application events or sensor readings, that must be ingested as it arrives rather than in periodic bulk loads (see the second sketch after this list).
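
As a minimal illustration of batch ingestion, the sketch below dumps a table from a relational database to a CSV file and then copies that file into a directory standing in for the big data store. The SQLite database, the customers table, and the landing directory are all hypothetical placeholders; a real pipeline would typically use a dedicated bulk-load tool, but the shape of the work is the same.

```python
import csv
import shutil
import sqlite3
from pathlib import Path

# Hypothetical names: a local SQLite file stands in for the CRM database,
# and a local directory stands in for HDFS or another big data store.
SOURCE_DB = "crm.db"
LANDING_DIR = Path("landing")

def create_sample_source() -> None:
    """Create a small CRM table so the sketch is runnable end to end."""
    with sqlite3.connect(SOURCE_DB) as conn:
        conn.execute("CREATE TABLE IF NOT EXISTS customers (id INTEGER, name TEXT)")
        conn.execute("DELETE FROM customers")
        conn.executemany("INSERT INTO customers VALUES (?, ?)",
                         [(1, "Ada"), (2, "Grace")])

def batch_ingest(table: str) -> Path:
    """Dump one table from the relational source, then bulk-load the dump."""
    dump_file = Path(f"{table}_dump.csv")

    # Step 1: bulk-export the table from the relational database.
    with sqlite3.connect(SOURCE_DB) as conn, dump_file.open("w", newline="") as f:
        cursor = conn.execute(f"SELECT * FROM {table}")
        writer = csv.writer(f)
        writer.writerow([col[0] for col in cursor.description])  # header row
        writer.writerows(cursor)

    # Step 2: load the dump into the target storage platform in one shot.
    LANDING_DIR.mkdir(exist_ok=True)
    return Path(shutil.copy(dump_file, LANDING_DIR / dump_file.name))

if __name__ == "__main__":
    create_sample_source()
    print(f"Loaded {batch_ingest('customers')}")
```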

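For stream ingestion, each event is handled the moment it arrives rather than being accumulated into a dump. The sketch below, again a hedged illustration with made-up names, simulates a continuous event source with a Python generator and writes each event to the target as soon as it is produced; in practice the source would be something like a Kafka topic or a message queue, and the sink would be your storage platform.

```python
import json
import time
from pathlib import Path
from typing import Iterator

# Hypothetical sink: a local JSON-lines file stands in for the big data store.
SINK = Path("events.jsonl")

def event_source(n: int = 5) -> Iterator[dict]:
    """Simulate a continuous stream, e.g. clickstream or sensor events."""
    for i in range(n):
        time.sleep(0.1)  # events trickle in over time
        yield {"event_id": i, "ts": time.time(), "kind": "click"}

def stream_ingest(events: Iterator[dict]) -> None:
    """Ingest each event as it arrives instead of in bulk."""
    with SINK.open("a") as sink:
        for event in events:
            sink.write(json.dumps(event) + "\n")
            sink.flush()  # make each event durable immediately

if __name__ == "__main__":
    stream_ingest(event_source())
    print(f"Ingested events into {SINK}")
```

The key contrast with the batch sketch is that there is no dump file: the write happens once per event, which keeps latency low at the cost of many small writes.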