ParallelMerge

The example in this section (Example 11-8) is more complex and requires a little familiarity with the Standard Template Library (STL) to fully understand. It shows the power of parallel_for beyond flat iteration spaces. The code performs a parallel merge of two sorted sequences. It works for any sequence with a random-access iterator. The algorithm operates recursively as follows:

  1. If the sequences are too short for effective use of parallelism, it does a sequential merge. Otherwise, it performs steps 2–6.

  2. It swaps the sequences if necessary so that the first sequence, [begin1, end1), is at least as long as the second sequence, [begin2, end2).

  3. It sets m1 to the middle position in [begin1, end1). The item at that position is called key.

  4. It sets m2 to the position in [begin2, end2) where key would have to be inserted to keep that sequence sorted.

  5. It merges [begin1,m1) and [begin2,m2) to create the first part of the merged sequence.

  6. It merges [m1,end1) and [m2,end2) to create the second part of the merged sequence.
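The six steps above can be sketched without any threading library. The following is a minimal sequential sketch, not the book's listing: the names recursive_merge, merge_sorted, and kGrainSize are hypothetical, the cutoff value is arbitrary, and "too short" is assumed to mean that the shorter sequence has at most kGrainSize items. The two recursive calls write to disjoint parts of the output, which is what would let a parallel version run them concurrently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical cutoff: at or below this length, recursion is not worthwhile.
constexpr std::ptrdiff_t kGrainSize = 4;

// Merges sorted [begin1,end1) and [begin2,end2) into the range starting at out.
// Both iterator types must be random-access.
template <typename Iterator, typename OutIterator>
void recursive_merge(Iterator begin1, Iterator end1,
                     Iterator begin2, Iterator end2,
                     OutIterator out) {
    // Step 1: short inputs are merged sequentially.
    if (std::min(end1 - begin1, end2 - begin2) <= kGrainSize) {
        std::merge(begin1, end1, begin2, end2, out);
        return;
    }
    // Step 2: make the first sequence the longer one.
    if (end1 - begin1 < end2 - begin2) {
        std::swap(begin1, begin2);
        std::swap(end1, end2);
    }
    // Step 3: key is the middle item of the first sequence.
    Iterator m1 = begin1 + (end1 - begin1) / 2;
    // Step 4: position in the second sequence where key would fall.
    Iterator m2 = std::lower_bound(begin2, end2, *m1);
    // The second half of the output starts after everything that
    // precedes m1 and m2 in the two inputs.
    OutIterator out2 = out + (m1 - begin1) + (m2 - begin2);
    // Steps 5 and 6: the two sub-merges touch disjoint output ranges.
    recursive_merge(begin1, m1, begin2, m2, out);
    recursive_merge(m1, end1, m2, end2, out2);
}

// Convenience wrapper for trying the sketch on two sorted vectors.
std::vector<int> merge_sorted(const std::vector<int>& a,
                              const std::vector<int>& b) {
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));
    std::vector<int> result(a.size() + b.size());
    recursive_merge(a.begin(), a.end(), b.begin(), b.end(), result.begin());
    return result;
}
```

Calling merge_sorted on two sorted vectors returns a vector holding all items of both in sorted order. Because step 2 may swap the inputs, this sketch does not promise a stable merge.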

The Intel Threading Building Blocks implementation of this algorithm uses the Range object to perform most of the steps. The predicate is_divisible performs the test in step 1, along with step 2. The splitting constructor performs steps 3–6. The body object does the sequential merges.
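To make this division of labor concrete, here is a library-free sketch of a Range-shaped object. It is an illustration under stated assumptions, not the listing from Example 11-8: split_tag stands in for the library's split type, SketchMergeRange, merge_body, run, and merge_with_range are hypothetical names, and run is a sequential stand-in for what a parallel algorithm would do with the range. For simplicity, the sketch folds the swap of step 2 into the splitting constructor so that is_divisible can remain const; the real listing may arrange this differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the library's split tag type.
struct split_tag {};

using Iter = std::vector<int>::iterator;

struct SketchMergeRange {
    static constexpr std::ptrdiff_t grainsize = 4;  // arbitrary cutoff
    Iter begin1, end1;  // first sorted sequence
    Iter begin2, end2;  // second sorted sequence
    Iter out;           // where the merged result is written

    SketchMergeRange(Iter b1, Iter e1, Iter b2, Iter e2, Iter o)
        : begin1(b1), end1(e1), begin2(b2), end2(e2), out(o) {}

    bool empty() const { return (end1 - begin1) + (end2 - begin2) == 0; }

    // Step 1: is the range long enough to be worth splitting?
    bool is_divisible() const {
        return std::min(end1 - begin1, end2 - begin2) > grainsize;
    }

    // Splitting constructor: r keeps the lower halves,
    // the new object takes the upper halves.
    SketchMergeRange(SketchMergeRange& r, split_tag) {
        // Step 2 (placed here by assumption): make r's first sequence longer.
        if (r.end1 - r.begin1 < r.end2 - r.begin2) {
            std::swap(r.begin1, r.begin2);
            std::swap(r.end1, r.end2);
        }
        Iter m1 = r.begin1 + (r.end1 - r.begin1) / 2;       // step 3
        Iter m2 = std::lower_bound(r.begin2, r.end2, *m1);  // step 4
        begin1 = m1;  end1 = r.end1;                        // step 6 inputs
        begin2 = m2;  end2 = r.end2;
        out = r.out + (m1 - r.begin1) + (m2 - r.begin2);
        r.end1 = m1;                                        // step 5 inputs
        r.end2 = m2;
    }
};

// Body: the sequential merge applied to a range that is no longer divisible.
void merge_body(const SketchMergeRange& r) {
    std::merge(r.begin1, r.end1, r.begin2, r.end2, r.out);
}

// Sequential stand-in for the parallel algorithm: split while divisible,
// then apply the body to each piece.
void run(SketchMergeRange r) {
    if (r.empty()) return;
    if (r.is_divisible()) {
        SketchMergeRange upper(r, split_tag{});
        run(r);
        run(upper);
    } else {
        merge_body(r);
    }
}

// Convenience wrapper for trying the sketch on two sorted vectors.
std::vector<int> merge_with_range(std::vector<int> a, std::vector<int> b) {
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));
    std::vector<int> result(a.size() + b.size());
    run(SketchMergeRange(a.begin(), a.end(), b.begin(), b.end(),
                         result.begin()));
    return result;
}
```

In this arrangement the recursion lives entirely in the range: the driver only asks whether the range is divisible, splits it, and hands indivisible pieces to the body.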

Example 11-8. Parallel merge

#include "tbb/parallel_for.h"
#include <algorithm>

using namespace tbb;

template<typename Iterator>
struct ParallelMergeRange {
    static size_t grainsize;
    Iterator begin1, end1; // [begin1,end1) is 1st sequence to be ...
