Appendix 5 Loop Unroll Degree Minimization: Experimental Results

All our benchmarks were cross-compiled on a regular Dell workstation equipped with a 2.4 GHz Intel(R) Core(TM)2 CPU and running a 64-bit Linux operating system (kernel version 2.6).

A5.1. Stand-alone experiments with single register types

This section presents full experiments with a stand-alone tool, considering a single register type only. The tool is independent of the compiler and of the processor architecture. We demonstrate the efficiency of our loop unrolling minimization method for both unscheduled loops (as studied in section 11.4) and scheduled loops (as studied in section 11.6).

A5.1.1. Experiments with unscheduled loops

In this context, our stand-alone tool takes a data dependence graph (DDG) as input, immediately after the periodic register allocation performed by SIRA, and applies loop unrolling minimization (LUM).

A5.1.2. Results on randomly generated data dependence graphs

First, our stand-alone software generates the number of distinct reuse circuits k and their weights (μ₁, …, μ_k). Afterwards, we calculate the number of remaining registers and the loop unrolling degree ρ = lcm(μ₁, …, μ_k). Finally, we apply our method for minimizing ρ.
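The arithmetic of this procedure can be illustrated with a small Python sketch. It is not the minimization algorithm evaluated in this appendix: it is a brute-force enumeration, usable only for tiny instances, and it assumes that each remaining register may be added to any reuse circuit, increasing that circuit's weight by one. The function names `unroll_degree` and `minimize_unroll_degree` and the parameter `remaining` are hypothetical, introduced for illustration only.

```python
from itertools import product
from math import lcm  # multi-argument lcm requires Python 3.9+


def unroll_degree(weights):
    """Loop unrolling degree rho = lcm(mu_1, ..., mu_k)."""
    return lcm(*weights)


def minimize_unroll_degree(weights, remaining):
    """Exhaustively search for the smallest achievable unrolling degree.

    Assumption (not taken from the text): each of the `remaining`
    registers may be added to any reuse circuit, raising its weight by 1.
    Returns (best_rho, best_weights).
    """
    best_rho = unroll_degree(weights)
    best_weights = tuple(weights)
    # Try every way of giving 0..remaining extra registers to each circuit.
    for extra in product(range(remaining + 1), repeat=len(weights)):
        if sum(extra) > remaining:
            continue  # uses more registers than are left
        candidate = tuple(w + e for w, e in zip(weights, extra))
        rho = unroll_degree(candidate)
        if rho < best_rho:
            best_rho, best_weights = rho, candidate
    return best_rho, best_weights


if __name__ == "__main__":
    weights = [3, 4, 5]                          # mu_1, mu_2, mu_3
    print(unroll_degree(weights))                # 60
    print(minimize_unroll_degree(weights, 3))    # (5, (5, 5, 5))
```

With weights (3, 4, 5) the initial unrolling degree is 60. Under the stated assumption, three remaining registers are enough to raise every weight to 5, which brings the unrolling degree down to 5. The enumeration is exponential in k, so it serves only to make the objective concrete.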

We did extensive random generations on many configurations: we varied the number of available registers from 4 to 256, and we considered 10,000 random instances containing multiple ...

