parallel_reduce with partitioner

The parallel_reduce template accepts an optional third argument that specifies a partitioner. See the section “Automatic grain size” for more information.

Example 3-15 extends Example 3-10 and Example 3-11 by using an auto_partitioner.

Example 3-15. Parallel sum with partitioner

#include "tbb/parallel_reduce.h"
#include "tbb/blocked_range.h"

using namespace tbb;

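struct Sum {
    float value;                       // running sum held by this body
    Sum() : value(0) {}
    // Splitting constructor: a newly split body starts with an empty sum.
    Sum( Sum& s, split ) {value = 0;}
    // Accumulate the elements of one subrange into value.
    void operator()( const blocked_range<float*>& range ) {
        float temp = value;
        for( float* a=range.begin(); a!=range.end(); ++a ) {
            temp += *a;
        }
        value = temp;
    }
    // Merge the partial sum of a split-off body into this one.
    void join( Sum& rhs ) {value += rhs.value;}
};
float ParallelSum( float array[], size_t n ) {
    Sum total;
    // No grainsize is given; the auto_partitioner chooses chunk sizes.
    parallel_reduce( blocked_range<float*>( array, array+n ),
                     total, auto_partitioner() );
    return total.value;
}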

Note two important changes from Example 3-11:

  • The call to parallel_reduce takes a third argument, an auto_partitioner object.

  • The blocked_range constructor is not given a grainsize parameter, because the auto_partitioner chooses the chunk sizes.
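To see why the grainsize can be left to the partitioner, it may help to imitate the split/join protocol serially. The sketch below is not how Threading Building Blocks implements parallel_reduce; it is a plain C++ illustration that needs no library. The names split_tag, emulate_reduce, EmulatedSum, and the chunk parameter are hypothetical and exist only for this example. It halves the range until a piece has at most chunk elements, gives each right-hand half a freshly split body, and joins the partial sums on the way back.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for tbb::split; used only as a tag type here.
struct split_tag {};

struct Sum {
    float value;
    Sum() : value(0) {}
    // Splitting constructor: the new body starts with an empty sum.
    Sum( Sum&, split_tag ) : value(0) {}
    // Accumulate one subrange [begin, end) into value.
    void operator()( const float* begin, const float* end ) {
        float temp = value;
        for( const float* a=begin; a!=end; ++a ) {
            temp += *a;
        }
        value = temp;
    }
    // Merge the partial sum of a split-off body into this one.
    void join( Sum& rhs ) {value += rhs.value;}
};

// Hypothetical serial imitation of a reduction: halve the range until a
// piece holds at most `chunk` elements, then apply the body to that piece.
void emulate_reduce( const float* begin, const float* end,
                     Sum& body, std::size_t chunk ) {
    assert( begin <= end );
    std::size_t n = static_cast<std::size_t>( end - begin );
    if( n <= chunk || n <= 1 ) {       // small enough: process directly
        body( begin, end );
        return;
    }
    const float* mid = begin + n/2;
    Sum right( body, split_tag() );    // split off a body for the right half
    emulate_reduce( begin, mid, body, chunk );
    emulate_reduce( mid, end, right, chunk );
    body.join( right );                // fold the right half's sum back in
}

float EmulatedSum( const float array[], std::size_t n, std::size_t chunk ) {
    Sum total;
    emulate_reduce( array, array+n, total, chunk );
    return total.value;
}
```

Calling EmulatedSum with different chunk values on the same array adds up the same elements, only grouped differently. That is the property that lets a partitioner pick the chunk size freely. Because floating-point addition is not associative, the low-order bits of the result can still vary with the grouping when the values are not exactly representable.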
