4.2. Data management techniques and solutions

A grid can increase application performance through parallelism. This implies that a big job must be divided into smaller ones. From a data point of view, it may be necessary to split the input data before processing and to gather the results afterward. These two operations, which occur respectively before and after job submission, are called data pre-processing and data post-processing. The data splitting can be triggered each time a job is submitted, or it can be done once in advance. Similarly, the gathering and joining of results can be handled in multiple ways, depending on the requirements.
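The split-and-gather pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the Globus Toolkit: the helper names `split_input`, `gather_results`, and `run_job` are hypothetical, the input is assumed to be a list of independent records, and `run_job` stands in locally for whatever a real job would do on a grid node.

```python
from typing import Iterable, List, Sequence


def split_input(records: Sequence[str], n_jobs: int) -> List[List[str]]:
    """Pre-processing: divide the input into at most n_jobs contiguous chunks."""
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    chunk_size, remainder = divmod(len(records), n_jobs)
    chunks: List[List[str]] = []
    start = 0
    for i in range(n_jobs):
        # The first `remainder` chunks take one extra record each.
        end = start + chunk_size + (1 if i < remainder else 0)
        if end > start:  # skip empty chunks when there are more jobs than records
            chunks.append(list(records[start:end]))
        start = end
    return chunks


def gather_results(partials: Iterable[List[str]]) -> List[str]:
    """Post-processing: join per-job outputs in the order the jobs were created."""
    merged: List[str] = []
    for part in partials:
        merged.extend(part)
    return merged


def run_job(chunk: List[str]) -> List[str]:
    """Hypothetical stand-in for a job that would run on a grid node."""
    return [record.upper() for record in chunk]


if __name__ == "__main__":
    records = [f"record-{i}" for i in range(10)]
    chunks = split_input(records, 3)
    results = gather_results(run_job(chunk) for chunk in chunks)
    print(len(chunks), len(results))
```

Keeping the chunks contiguous and joining them in creation order means the gathered output preserves the order of the original input. Whether that matters depends on the application.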

In the first case, the Globus Toolkit does not provide tools to perform the pre- and post-processing tasks. Therefore, ...
