9.7 DESIGN 3: PIPELINED INPUT AND OUTPUT

An attractive implementation stores both the input and the output of each PE in a register. This yields a fully pipelined design, which is potentially the fastest design possible. Assume, without loss of generality, that N is even. We can write Eq. 9.7 as

(9.18) c09e018

We perform an iteration on the inputs X in the above equation:

(9.19) c09e019

(9.20) c09e020

and the output is given by

(9.21) c09e021

The above equation can be written as the iteration

(9.22) c09e022

(9.23) c09e023

(9.24) c09e024

(9.25) c09e025

Figure 9.4a shows the resulting DAG for an output sample, y. The figure can be replicated to show the different DAGs for other output samples. This is ...
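To make the idea of registering both the input and the output of every PE concrete, the following is a minimal cycle-by-cycle sketch in Python. It is not the derivation of Eqs. 9.18 to 9.25. It assumes that Eq. 9.7 describes an N-tap FIR filter with coefficients a(i), and it uses one common fully pipelined arrangement in which x passes through two registers per PE and y passes through one. The names `direct_fir`, `pipelined_fir`, `x_mid`, `x_out`, and `y_out` are hypothetical and chosen only for illustration.

```python
def direct_fir(a, x):
    """Reference result: y(n) = sum_i a[i] * x(n - i), with x(n) = 0 for n < 0."""
    return [sum(a[i] * x[n - i] for i in range(len(a)) if n - i >= 0)
            for n in range(len(x))]


def pipelined_fir(a, x):
    """Simulate a chain of N PEs in which both x and y are registered.

    Assumed structure (an illustration, not the book's exact design):
    x passes through two registers per PE, y through one, and no
    signal is broadcast to more than one PE.
    """
    n_taps = len(a)
    x_mid = [0] * n_taps  # first x register inside each PE
    x_out = [0] * n_taps  # second x register (registered x output)
    y_out = [0] * n_taps  # registered y output
    outputs = []

    # Run N - 1 extra cycles so the last samples drain out of the pipeline.
    for t in range(len(x) + n_taps - 1):
        x_new = x[t] if t < len(x) else 0

        # Each PE reads only the registered outputs of its left neighbour.
        x_in = [x_new] + x_out[:-1]
        y_in = [0] + y_out[:-1]

        # Clock edge: all registers update from the values read above.
        x_out = x_mid
        x_mid = x_in
        y_out = [y_in[i] + a[i] * x_in[i] for i in range(n_taps)]

        outputs.append(y_out[-1])

    # The first N - 1 entries are pipeline fill, not valid output samples.
    return outputs[n_taps - 1:]


if __name__ == "__main__":
    coeffs = [1, 2, 3, 4]          # N = 4 (even), arbitrary example values
    samples = [1, 0, 2, -1, 3, 5]  # arbitrary example input
    print(pipelined_fir(coeffs, samples) == direct_fir(coeffs, samples))
```

In this sketch every PE reads only the registers of its immediate neighbour, so the critical path is one multiply-add regardless of N. The cost is a pipeline latency that grows with the number of PEs.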
