Algorithms and Parallel Computing

11.11 THE RESULTING MULTITHREADED/MULTIPROCESSOR ARCHITECTURE

At this stage, we have made two choices:

1. An affine scheduling function (Eq. 11.73).

2. A projection direction (Eq. 11.96), which produced the projection matrix (Eq. 11.107).

From these results, we can construct the reduced, or projected, computation domain shown in Fig. 11.6 for the case I = 3, J = 4, and K = 5. Each node in the projected domain represents a task to be performed at a given time step, either by a software thread or by a processing element (PE) in a systolic array. The input data M₂(i, j) are broadcast from memory. The output M₁(i, j) of each task is used as input by adjacent tasks at the next time step.
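The mapping from points of the original domain to (node, time step) pairs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scheduling vector S, projection direction D, and projection matrix P below are hypothetical stand-ins, since Eqs. 11.73, 11.96, and 11.107 are not reproduced in this section. The sketch also assumes a three-dimensional index domain 0 ≤ i < I, 0 ≤ j < J, 0 ≤ k < K, and the helper names are invented for the example.

```python
from itertools import product

I, J, K = 3, 4, 5  # domain sizes quoted in the text

# Hypothetical stand-ins (assumptions, not the book's actual equations):
S = (0, 0, 1)                # assumed scheduling vector, so time step t = k
D = (0, 0, 1)                # assumed projection direction, along the k axis
P = ((1, 0, 0), (0, 1, 0))   # assumed projection matrix, chosen so P * D = 0


def dot(u, v):
    """Inner product of two equal-length integer tuples."""
    return sum(a * b for a, b in zip(u, v))


def time_step(p):
    """Time step assigned to point p by the (assumed) scheduling vector."""
    return dot(S, p)


def project(p):
    """Node of the projected domain that point p is mapped to."""
    return tuple(dot(row, p) for row in P)


# Sanity checks on the assumed choices: points along D collapse onto one
# node (P * D = 0) but execute at different times (S . D != 0).
assert all(dot(row, D) == 0 for row in P)
assert dot(S, D) != 0

# schedule[node][t] is the original point that the node executes at time t.
schedule = {}
for p in product(range(I), range(J), range(K)):
    schedule.setdefault(project(p), {})[time_step(p)] = p

for node in sorted(schedule):
    print(node, sorted(schedule[node]))
```

With these assumed values, each node of the projected domain corresponds to one (i, j) pair and handles one point of the original domain per time step. A different scheduling function or projection direction would change both the number of nodes and the number of time steps each node is busy.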

Algorithms and Parallel Computing by Fayez Gebali
