7.6 LOOP UNROLLING

Loop unrolling replicates the body of a loop so that each iteration of the transformed loop does the work of several original iterations; a fully unrolled loop becomes a straight-line sequence of statements. It is a parallelizing and optimizing compiler technique [29] used to reduce loop overhead, that is, the instructions that update the loop index and test the termination condition. The technique is also used to expose instruction-level parallelism [20]. Consider the loop shown in Listing 7.4 [20]:

Listing 7.4 Exposing potential parallelism by loop unrolling

for i = 1:I do
    y(i) = y(i) + y(i - 5)
end for

Each new value of y(i) depends on its current value and on y(i - 5), the element five positions earlier. Because the dependence distance is 5, any five consecutive iterations are independent of one another, so the loop can be unrolled to execute five statements in parallel, as shown in Listing 7.5 [20].

Listing 7.5 Exposing potential parallelism by loop unrolling

for i = 1:5:I do
    y(i) = y(i) + y(i - 5)
    y(i + 1) = y(i + 1) + y(i - 4)
    y(i + 2) = y(i + 2) + y(i - 3)
    y(i + 3) = y(i + 3) + y(i - 2)
    y(i + 4) = y(i + 4) + y(i - 1)
end for

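Each iteration of the unrolled loop now contains five statements that do not depend on one another: they write y(i) through y(i + 4) and read only y(i - 5) through y(i - 1), which were produced by the previous iteration. The five statements can therefore execute in parallel, for an ideal speedup of 5 when at least five processing units are available. The unrolled iterations themselves must still run in order, and the listing as written assumes that I is a multiple of 5.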
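The equivalence of the two listings can be checked with a short Python sketch. The function names below are hypothetical and not taken from [20]. The sketch assumes 0-based indexing, with the first five list entries holding the initial values y(-4) through y(0), and it assumes the number of updated elements is a multiple of 5. The third function reads the whole previous block before writing the current one, which illustrates why the five statements are free to run in parallel; it does not itself run anything concurrently.

```python
def recurrence_rolled(y, n):
    """Listing 7.4: update n elements one at a time (y[0:5] are initial values)."""
    y = list(y)
    for i in range(5, 5 + n):
        y[i] = y[i] + y[i - 5]
    return y


def recurrence_unrolled(y, n):
    """Listing 7.5: the same loop unrolled by 5 (assumes n is a multiple of 5)."""
    if n % 5 != 0:
        raise ValueError("this sketch assumes n is a multiple of 5")
    y = list(y)
    for i in range(5, 5 + n, 5):
        y[i] = y[i] + y[i - 5]
        y[i + 1] = y[i + 1] + y[i - 4]
        y[i + 2] = y[i + 2] + y[i - 3]
        y[i + 3] = y[i + 3] + y[i - 2]
        y[i + 4] = y[i + 4] + y[i - 1]
    return y


def recurrence_blockwise(y, n):
    """Same computation, with all five reads done before any of the five writes."""
    if n % 5 != 0:
        raise ValueError("this sketch assumes n is a multiple of 5")
    y = list(y)
    for i in range(5, 5 + n, 5):
        previous = y[i - 5:i]   # written by the previous iteration
        current = y[i:i + 5]    # not yet updated in this iteration
        # The five sums are mutually independent, so their order is irrelevant.
        y[i:i + 5] = [c + p for c, p in zip(current, previous)]
    return y


if __name__ == "__main__":
    data = list(range(1, 16))   # five initial values followed by ten elements
    print(recurrence_rolled(data, 10))
    print(recurrence_unrolled(data, 10) == recurrence_rolled(data, 10))
```

When the trip count is not a multiple of the unrolling factor, a common approach is to handle the leftover iterations in a short remainder loop after the unrolled one.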
