Loop Optimization Example
CSCI 410 [Bono]
Note: This example from lecture is a simplified version of the Pascal
example given in the Parson's book in Section 6.1.3. We're assuming
the array is stored in row-major order, and that all the arrays are
the same size. Here we show step by step how each
optimization could be applied to this code.
Original Code
for (j = 0; j <= 30; j++) {
for (k = 0; k <= 30; k++) {
x[j][k] = y[j][k] + z[j][k];
}
}
Common-subexpression eliminations applied
The offsets for each of the arrays is the same, so we can eliminate
that common subexpression (i.e., compute the value in t2 only once,
and use it 3 times).
Computations for array indexing shown using pseudo-high-level-lang instead
of 3AC. C1 is a contant representing W*NCOLS. W is the
width of an element, and NCOLS is the number of columns in each of the arrays.
for (j = 0; j <= 30; j++) {
for (k = 0; k <= 30; k++) {
t2 = C1 * j + W * k;
x[t2] = y[t2] + z[t2];
}
}
Code-motion applied
The expression C1*j is constant in the inner loop and can
be taken out:
for (j = 0; j <= 30; j++) {
t1 = C1 * j
for (k = 0; k <= 30; k++) {
t2 = t1 + W * k;
x[t2] = y[t2] + z[t2];
}
}
Loop reduction in strength applied
We can replace mults by additions in the inner and outer loop, since
t1 and t2 are induction variables.
t1 = 0;
for (j = 0; j <= 30; j++) {
t2 = t1 + 0;
for (k = 0; k <= 30; k++) {
x[t2] = y[t2] + z[t2];
t2 = t2 + W;
}
t1 = t1 + C1;
}
The University of Southern California does not screen or control the content on this website and thus does not guarantee the accuracy, integrity, or quality of such content. All content on this website is provided by and is the sole responsibility of the person from which such content originated, and such content does not necessarily reflect the opinions of the University administration or the Board of Trustees