  • sudhir patil thinks this is interesting:

  • Processing out-of-order data based on application timestamps (also called event time)

  • Maintaining large amounts of state

  • Supporting high data throughput

  • Processing each event exactly once despite machine failures

  • Handling load imbalance and stragglers

  • Responding to events at low latency

  • Joining with external data in other storage systems

  • Determining how to update output sinks as new events arrive

  • Writing data transactionally to output systems

  • Updating your application’s business logic at runtime

  • From Spark: The Definitive Guide

    Challenges with streaming systems
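
To make the first challenge in the list (processing out-of-order data by event time) a little more concrete, here is a minimal plain-Python sketch. It is not Spark code and not taken from the book: the function name `tumbling_window_counts`, the `(event_id, event_time)` tuple layout, and the 10-second window are all assumptions chosen for illustration. It groups events by the timestamp they carry rather than by arrival order, and it drops redelivered event IDs so each event is counted once, which only imitates the effect of exactly-once processing within a single process.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds=10):
    """Count unique events per event-time window, regardless of arrival order.

    `events` is an iterable of (event_id, event_time) pairs, where
    event_time is an integer number of seconds (a hypothetical layout).
    """
    seen_ids = set()           # state: IDs already counted (deduplication)
    counts = defaultdict(int)  # state: window start -> number of events
    for event_id, event_time in events:
        if event_id in seen_ids:
            continue  # a redelivered event must not be counted twice
        seen_ids.add(event_id)
        # Assign the event to a window using its own timestamp,
        # not the order in which it happened to arrive.
        window_start = (event_time // window_seconds) * window_seconds
        counts[window_start] += 1
    return dict(sorted(counts.items()))


# Events arrive out of order, and "b" is delivered twice.
events = [("a", 3), ("b", 12), ("c", 7), ("b", 12), ("d", 25), ("e", 1)]
result = tumbling_window_counts(events)
print(result)  # {0: 3, 10: 1, 20: 1}
```

Note that the sketch keeps every event ID and every window in memory forever. A real streaming system has to bound that state and survive machine failures, which is where several of the other challenges in the list come from.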