Continuing with the naive implementation, let's measure its performance with a simple benchmark that compares it to std::transform() running on a single CPU core.
We measure two scenarios:
- Process 32 elements with an expensive function called heavy_f
- Process 100,000,000 elements with an inexpensive function called light_f
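As a rough sketch of how such a measurement might be set up, the snippet below times a single std::transform() call with std::chrono::steady_clock. The names time_ms, light_f_sketch, and measure_light_sketch are hypothetical helpers introduced only for illustration; they are not the measurement code used in this text, and the real light_f may do something different.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper (an assumption, not from the text): runs `work` once
// and returns the elapsed wall-clock time in milliseconds.
template <typename Work>
double time_ms(Work&& work) {
  auto start = std::chrono::steady_clock::now();
  work();
  auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Inexpensive placeholder transform; the light_f in the text may differ.
inline float light_f_sketch(float v) { return v + 1.0f; }

// Times std::transform() over n elements. The results are written to a
// caller-owned vector so that the work has an observable effect.
inline double measure_light_sketch(std::size_t n, std::vector<float>& dst) {
  auto src = std::vector<float>(n, 1.0f);
  dst.resize(n);
  auto elapsed = time_ms([&] {
    std::transform(src.begin(), src.end(), dst.begin(), light_f_sketch);
  });
  assert(dst.size() == n);
  return elapsed;
}
```

A single timed run like this is noisy, so it is only suitable for spotting large differences. Passing n = 100'000'000 would correspond to the second scenario, at the cost of allocating two vectors of that size.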
The following code processes a low number of elements with the expensive heavy_f transform function:
// Low number of elements - heavy transform function
auto heavy_f = [](float v) {
  auto sum = v;
  for (size_t i = 0; i < 100'000'000; ++i) {
    sum += (i*i*i*sum);
  }
  return sum;
};

auto measure_heavy() {
  auto n = 32;
  auto src = std::vector<float>(n);
  auto dst = std::vector<float>(n);
  std::transform(src.begin(), ...