11.11 THE RESULTING MULTITHREADED/MULTIPROCESSOR ARCHITECTURE

At this stage, we have the following:

1. We have chosen a certain affine scheduling function (Eq. 11.73),

c11ue036

2. We have chosen a certain projection direction (Eq. 11.96),

c11ue037

which produced the projection matrix (Eq. 11.107)

c11ue038

From these results, we can construct the reduced, or projected, computation domain (c11ue039), shown in Fig. 11.6 for the case I = 3, J = 4, and K = 5. Each node in c11ue040 represents a task to be performed by a software thread, or by a PE in a systolic array, at a given time step. The input data M2(i, j) are broadcast from memory. The output data M1(i, j) are the outputs of the tasks and are used as inputs to adjacent tasks at the next time step.
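As a rough illustration of how a scheduling function and a projection matrix together assign every point of the computation domain to a task (node) and a time step, consider the minimal sketch below. The scheduling vector S, the projection direction D, and the projection matrix P used here are hypothetical stand-ins, not the values of Eqs. 11.73, 11.96, and 11.107, and the zero-based index ranges are likewise an assumption. Only the bounds I = 3, J = 4, and K = 5 come from the text.

```python
from itertools import product

# Domain bounds taken from the text (I = 3, J = 4, K = 5).
I, J, K = 3, 4, 5

# ASSUMED values for illustration only; they are NOT the book's
# Eq. 11.73 (scheduling), Eq. 11.96 (direction), or Eq. 11.107 (matrix).
S = (0, 0, 1)        # hypothetical scheduling vector: t(p) = S . p
D = (0, 0, 1)        # hypothetical projection direction
P = ((1, 0, 0),      # hypothetical projection matrix; each row is
     (0, 1, 0))      # orthogonal to D, so P . D = 0


def dot(u, v):
    """Inner product of two equal-length integer tuples."""
    return sum(a * b for a, b in zip(u, v))


def schedule(p):
    """Time step assigned to domain point p by the assumed schedule."""
    return dot(S, p)


def project(p):
    """Node (thread or PE) of the projected domain that executes point p."""
    return tuple(dot(row, p) for row in P)


def build_projected_domain():
    """Map each projected node to its (time, point) pairs, sorted by time."""
    tasks = {}
    # Zero-based index ranges are an assumption of this sketch.
    for p in product(range(I), range(J), range(K)):
        tasks.setdefault(project(p), []).append((schedule(p), p))
    for node in tasks:
        tasks[node].sort()
    return tasks


if __name__ == "__main__":
    domain = build_projected_domain()
    for node, work in sorted(domain.items()):
        times = [t for t, _ in work]
        print(f"node {node}: executes at time steps {times}")
```

Under these assumed values, all points that lie along D collapse onto a single node, and because S . D is nonzero, those points are given distinct time steps, so no node is asked to perform two tasks at once. A different choice of S or D would produce a different node count and workload per node.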
