Chapter 7

Large-Scale Data-Intensive Computing

Mark Parsons

EPCC, The University of Edinburgh, Edinburgh, United Kingdom

7.1 DIGITAL DATA: CHALLENGE AND OPPORTUNITY

7.1.1 The Challenge

When historians come to write the history of the early part of the 21st century, they will almost certainly describe this period as one of enormous transformation in the way human beings interact with each other and the world around them, and in how they store the information arising from those interactions.

The scientific community has managed large digital data sets, in particular in the particle physics and astronomy domains, for more than 30 years. Such data sets were once the preserve of the scientific domain, but over the past two decades we have witnessed the steady digitization of information throughout our daily lives. From digital photography to bar coding in shops, and from electronic tax returns to mobile phones, we are surrounded by data derived from digital devices. This revolution has happened surprisingly quickly and is almost certainly still in its infancy. Today, we are generating more stored data each year than in all of the preceding years combined. We are already struggling to deal with this data deluge and, as data volumes continue to double, the problem can only become more challenging. The challenges we face include

  • managing the increasing volume and complexity of primary data and the increasing rate of data derived from that primary data,
  • coping with the rapid growth in users of that data who want to ...
