You can time out and declare a window ready after you have not seen any new events for a while, but it could still happen that some events were buffered on another machine somewhere, delayed due to a network interruption. You need to be able to handle such straggler events that arrive after the window has already been declared complete. Broadly, you have two options:
Ignore the straggler events, as they are probably a small percentage of events in normal circumstances. You can track the number of dropped events as a metric, and alert if you start dropping a significant amount of data.
Publish a correction, an updat...
Applicable to late-arriving survey responses, i.e. those that arrive after the survey period has closed.
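The two options can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the class `WindowedCounter`, the `policy` flag, the 60-second tumbling window, and the idea that a "correction" means re-publishing the window's updated count are all assumptions made for the example. Timestamps are plain integers in seconds, and `close_window` stands in for whatever timeout logic declares a window complete.

```python
from collections import defaultdict

WINDOW_SIZE = 60  # seconds per tumbling window (assumed for illustration)


class WindowedCounter:
    """Counts events per window and handles stragglers (hypothetical sketch)."""

    def __init__(self, policy="drop"):
        self.policy = policy            # "drop" or "correct"
        self.counts = defaultdict(int)  # window start -> event count
        self.closed = set()             # windows already declared complete
        self.dropped = 0                # metric: stragglers ignored so far
        self.published = []             # (kind, window_start, count) tuples

    def window_of(self, event_time):
        # Map an event timestamp to the start of its tumbling window.
        return event_time - event_time % WINDOW_SIZE

    def close_window(self, window_start):
        # Called once the timeout decides no more events are expected.
        self.closed.add(window_start)
        self.published.append(("result", window_start, self.counts[window_start]))

    def on_event(self, event_time):
        w = self.window_of(event_time)
        if w not in self.closed:
            self.counts[w] += 1
            return
        # Straggler: its window was already declared complete.
        if self.policy == "drop":
            # Option 1: ignore it, but track how much data is being lost.
            self.dropped += 1
        else:
            # Option 2: publish an updated value for the closed window.
            self.counts[w] += 1
            self.published.append(("correction", w, self.counts[w]))


# Usage: two on-time events, then one straggler after the window closed.
dropper = WindowedCounter(policy="drop")
corrector = WindowedCounter(policy="correct")
for counter in (dropper, corrector):
    counter.on_event(5)
    counter.on_event(30)
    counter.close_window(0)
    counter.on_event(42)  # belongs to window 0, which is already closed

print(dropper.dropped, dropper.published)
print(corrector.dropped, corrector.published)
```

For the survey case, the "drop" policy would count late responses so you can alert if the number grows, while the "correct" policy would re-issue the survey totals each time a late response arrives.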