Hive

Apache Hive is a data warehouse built on top of Hadoop. It provides a SQL-like interface (HiveQL) to data residing on HDFS, and queries are executed as MapReduce (MR), Tez, or Spark jobs on the Hadoop cluster. Hive supports compressed storage formats such as ORC and, in releases before 3.0, indexing for faster queries. In the context of cyber security, Hive can be used to store aggregate views of the various logs generated by the CI applications.
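As a rough illustration of storing an aggregate view of logs in a compressed format, the following minimal sketch builds a HiveQL `CREATE TABLE ... AS SELECT` statement that summarizes raw log rows into an ORC-backed table. The table names and the columns `source_ip`, `event_type`, and `event_time` are hypothetical and do not come from the text. The helper only produces the statement as a string; running it would require submitting it through whatever Hive client is in use, which is not shown here.

```python
def build_log_summary_ctas(source_table: str, summary_table: str) -> str:
    """Return a HiveQL CTAS statement that aggregates raw log rows per day.

    The column names (source_ip, event_type, event_time) are assumptions
    made for illustration; adapt them to the actual log schema.
    """
    # Allow only simple identifiers (letters, digits, underscores, dots)
    # so that table names cannot smuggle in extra SQL.
    for name in (source_table, summary_table):
        if not name.replace("_", "").replace(".", "").isalnum():
            raise ValueError(f"unsafe table name: {name!r}")

    return (
        f"CREATE TABLE {summary_table}\n"
        "STORED AS ORC\n"                           # compressed columnar storage
        "TBLPROPERTIES ('orc.compress'='ZLIB')\n"   # ORC compression codec
        "AS\n"
        "SELECT source_ip,\n"
        "       event_type,\n"
        "       to_date(event_time) AS event_day,\n"
        "       COUNT(*) AS event_count\n"
        f"FROM {source_table}\n"
        "GROUP BY source_ip, event_type, to_date(event_time)"
    )


if __name__ == "__main__":
    # Hypothetical table names, used only to show the generated statement.
    print(build_log_summary_ctas("raw_app_logs", "log_summary_daily"))
```

Keeping the summary in a separate ORC table means that later queries read a small, compressed aggregate rather than rescanning the raw logs.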

While batch processing frameworks such as MR on Hadoop process very large volumes of data efficiently, they are not suitable for securing mission-critical CIs. Such CI systems require real-time (or at least near real-time) processing of streaming or micro-batch data for quick alerts, ...
