  • Brandon Reeser thinks this is interesting:

The amount of work that must be redone is not significant in typical failure scenarios for the MR paradigm. In essence, the MR approach provides FT at a fine-grained level, but does not have efficient query execution strategies. Hadoop MR is not well suited for interactive queries. The reasoning behind this assertion is that Hive or HBase, which might typically be used to service such queries in a Hadoop ecosystem, do not have sophisticated caching layers that can cache the results of important queries; instead, they might start fresh MR jobs for each query, resulting in significant latencies. This has been documented by, among others, Pavlo et al. (2009). The parallel database systems are goo...