
Accumulo by Billie Rinaldi, Aaron Cordova, Michael Wall


Chapter 9. Advanced Table Designs

After covering the basics of table design in the previous chapter, here we discuss advanced design considerations for storing some commonly encountered types of data in Accumulo. Examples include time series, graph, geospatial, feature vector, and other data.

Time-Ordered Data

Reading and writing data in time order is a common requirement. In a previous example, we ordered email messages in reverse time order within a particular folder belonging to a particular user account. Some applications want to access data primarily in time order. That is, the first and most important element of the data is the time component. Examples include time series such as stock data, application logs, and series of events captured by sensors.

We could simply use a timestamp as the row ID of a table. Rows will be sorted in increasing time order, and retrieving the data for one timestamp or a range of timestamps is straightforward.
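As a minimal sketch of this idea, the following Java example uses a `TreeMap` as a stand-in for a sorted table; it does not use the Accumulo client API. Because row IDs are sorted lexicographically, the sketch assumes non-negative epoch-millisecond timestamps and zero-pads them to a fixed width so that string order matches numeric order. The class and method names (`TimestampRowIds`, `rowId`, `write`, `scan`) are hypothetical and chosen only for illustration.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class TimestampRowIds {

    // Zero-pad a non-negative epoch-millisecond timestamp to 19 digits so
    // that lexicographic order of row IDs matches numeric timestamp order.
    // Without padding, "10000" would sort before "2000".
    static String rowId(long epochMillis) {
        return String.format("%019d", epochMillis);
    }

    // Store a value under the row ID derived from its timestamp.
    static void write(SortedMap<String, String> table, long epochMillis, String value) {
        table.put(rowId(epochMillis), value);
    }

    // Return all entries with startMillis <= timestamp < endMillis.
    static SortedMap<String, String> scan(SortedMap<String, String> table,
                                          long startMillis, long endMillis) {
        return table.subMap(rowId(startMillis), rowId(endMillis));
    }

    public static void main(String[] args) {
        // The TreeMap stands in for a table sorted by row ID.
        SortedMap<String, String> table = new TreeMap<>();
        write(table, 10_000L, "event-c");
        write(table, 1_000L, "event-a");
        write(table, 2_000L, "event-b");

        // Retrieve a range of timestamps; entries come back in time order.
        for (Map.Entry<String, String> e : scan(table, 1_500L, 20_000L).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

The fixed-width encoding is one design choice among several; any encoding whose byte order agrees with time order would serve the same purpose.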

But using a simple timestamp as the row ID of a table can be problematic when it comes to writing the data, because new data often arrives with timestamps that only ever increase. If we simply order our data this way, all new data will always be written to the end of the table, specifically to the last tablet, which spans from some timestamp we’ve already seen up to positive infinity (Figure 9-1).

Hotspot in time-ordered table
Figure 9-1. Hotspot in time-ordered ...
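The hotspot can be illustrated with a small simulation. This is a sketch under stated assumptions, not Accumulo code: the split points in `SPLITS` are made up, and each tablet is modeled as covering the timestamps greater than the previous split point and up to and including its own, with the last tablet unbounded above. The names `TabletHotspot`, `tabletFor`, and `countWrites` are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

public class TabletHotspot {

    // Hypothetical split points. Tablet i covers (SPLITS[i-1], SPLITS[i]];
    // the final tablet covers everything above the last split point.
    static final long[] SPLITS = {1_000L, 2_000L, 3_000L};

    // Find the index of the tablet whose range contains the timestamp.
    static int tabletFor(long timestamp) {
        for (int i = 0; i < SPLITS.length; i++) {
            if (timestamp <= SPLITS[i]) {
                return i;
            }
        }
        return SPLITS.length; // last tablet, unbounded above
    }

    // Count how many of n writes, starting at `first` and increasing by
    // `step` each time, land in each tablet.
    static Map<Integer, Integer> countWrites(long first, int n, long step) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int i = 0; i < n; i++) {
            counts.merge(tabletFor(first + i * step), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // New data arrives with timestamps later than any split point,
        // so every write is routed to the last tablet.
        System.out.println(countWrites(3_001L, 100, 10L)); // prints {3=100}
    }
}
```

In this model the first three tablets receive no writes at all once the incoming timestamps pass the last split point, which is the imbalance the figure depicts.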
