`parallel_reduce` is an extension of the parallel ranges used in earlier examples, but it adds the complexity of combining partial results and eventually reducing them to a single answer. Studying the examples in this section should make you very comfortable with `parallel_reduce` (Chapter 3).

Example 11-19 sums the values in an array.

Example 11-19. ParallelSum

    #include "tbb/parallel_reduce.h"
    #include "tbb/blocked_range.h"

    using namespace tbb;

    struct Sum {
        float value;
        Sum() : value(0) {}
        // Splitting constructor: a new body starts from the identity element.
        Sum( Sum& s, split ) {value = 0;}
        // Accumulate one subrange into the running total.
        void operator()( const blocked_range<float*>& range ) {
            float temp = value;
            for( float* a=range.begin(); a!=range.end(); ++a ) {
                temp += *a;
            }
            value = temp;
        }
        // Merge the partial result of another body into this one.
        void join( Sum& rhs ) {value += rhs.value;}
    };

    float ParallelSum( float array[], size_t n ) {
        Sum total;
        parallel_reduce( blocked_range<float*>( array, array+n, 1000 ), total );
        return total.value;
    }

This example is easily converted to do a reduction for any associative operation `op` as follows:

1. Replace occurrences of 0 with the identity element for `op`.
2. Replace occurrences of += with `op=` or its logical equivalent.
3. Change the name `Sum` to something more appropriate for `op`.
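As a rough illustration of those three steps, the sketch below turns the sum into a minimum reduction: 0 becomes positive infinity (the identity for min), += becomes an assignment through `std::min`, and `Sum` becomes `Min`. To keep the sketch runnable without the library, `split_tag` and `serial_reduce` are hypothetical stand-ins for `tbb::split` and `parallel_reduce`, and the body takes a pair of pointers instead of a `blocked_range`. The driver only imitates the split and join pattern on a single thread; it is not how the library schedules work.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>

// Hypothetical stand-in for tbb::split; it only tags the splitting constructor.
struct split_tag {};

// Minimum reduction body, derived from Sum by the three steps above.
struct Min {
    float value;
    // Step 1: the identity element for min is +infinity, not 0.
    Min() : value(std::numeric_limits<float>::infinity()) {}
    Min(Min&, split_tag) : value(std::numeric_limits<float>::infinity()) {}
    void operator()(const float* begin, const float* end) {
        float temp = value;
        for (const float* a = begin; a != end; ++a) {
            temp = std::min(temp, *a);   // Step 2: logical equivalent of "op="
        }
        value = temp;
    }
    void join(Min& rhs) { value = std::min(value, rhs.value); }
};

// Assumed single-threaded driver (not part of TBB): split the range in half
// until it is no larger than grainsize, run a freshly split body on the right
// half, then join the right result into the left one.
template <typename Body>
void serial_reduce(const float* begin, const float* end,
                   std::size_t grainsize, Body& body) {
    const std::size_t n = static_cast<std::size_t>(end - begin);
    if (n <= grainsize || n <= 1) {
        body(begin, end);
        return;
    }
    const float* mid = begin + n / 2;
    Body right(body, split_tag{});
    serial_reduce(begin, mid, grainsize, body);
    serial_reduce(mid, end, grainsize, right);
    body.join(right);
}

float SerialMin(const float array[], std::size_t n) {
    Min result;
    serial_reduce(array, array + n, 2, result);
    return result.value;
}
```

With the real library, the same `Min` body would take a `blocked_range<float*>` in `operator()` and `split` in the splitting constructor, exactly as `Sum` does in Example 11-19.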

The operation is allowed to be noncommutative. For example, `op` could be matrix multiplication.
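String concatenation is a smaller example of an associative but noncommutative operation, and it makes the ordering requirement easy to see. The sketch below is an assumption-laden illustration rather than library code: `split_tag` and `serial_reduce` are hypothetical stand-ins for `tbb::split` and `parallel_reduce`, run on one thread. What it shows is that the result stays correct as long as `join` always appends the right-hand partial result to the left-hand one.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for tbb::split; it only tags the splitting constructor.
struct split_tag {};

// Concatenation body: associative, but "ab" + "cd" differs from "cd" + "ab".
struct Concat {
    std::string value;                 // the empty string is the identity
    Concat() {}
    Concat(Concat&, split_tag) {}
    void operator()(const std::string* begin, const std::string* end) {
        for (const std::string* s = begin; s != end; ++s) {
            value += *s;
        }
    }
    // This body holds the left subrange and rhs the right one, so rhs is
    // appended; swapping the operands would scramble the output.
    void join(Concat& rhs) { value += rhs.value; }
};

// Assumed single-threaded driver (not part of TBB): recursive halving with
// the right half always joined into the left half.
template <typename Body, typename T>
void serial_reduce(const T* begin, const T* end,
                   std::size_t grainsize, Body& body) {
    const std::size_t n = static_cast<std::size_t>(end - begin);
    if (n <= grainsize || n <= 1) {
        body(begin, end);
        return;
    }
    const T* mid = begin + n / 2;
    Body right(body, split_tag{});
    serial_reduce(begin, mid, grainsize, body);
    serial_reduce(mid, end, grainsize, right);
    body.join(right);
}

std::string ConcatAll(const std::vector<std::string>& words) {
    Concat result;
    serial_reduce(words.data(), words.data() + words.size(), 1, result);
    return result.value;
}
```

However the range is split, the pieces come back in their original order, which is the property a noncommutative `op` such as matrix multiplication depends on.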

Example 11-20 does away with the need to supply a grain size by converting the prior example to use an `auto_partitioner`. Note how the `blocked_range` loses the `grainsize` parameter, ...
