9.7 DESIGN 3: PIPELINED INPUT AND OUTPUT
An attractive implementation stores both the input and the output of each PE in a register. This yields a fully pipelined design, which is potentially the fastest design possible. Assume without loss of generality that N is even. We can write Eq. 9.7 as
(9.18)
We perform an iteration on the inputs X in the above equation:
(9.19)
(9.20)
and the output is given by
(9.21)
The above equation can be written as the iteration
(9.22)
(9.23)
(9.24)
(9.25)
Figure 9.4a shows the resulting DAG for an output sample, y. The figure can be replicated to show the different DAGs for other output samples. This is ...
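To make the idea of registering both the input and the output of every PE concrete, the following is a minimal cycle-by-cycle sketch in Python. It rests on assumptions that this section does not state: that Eq. 9.7 is the z-domain form of an N-tap FIR filter, y(n) = Σ a(i) x(n − i), and that each PE delays the x stream by two registers and the partial sum y by one register. The names `fir_direct`, `fir_pipelined`, `a`, and `x` are hypothetical, and the structure is one plausible fully pipelined arrangement, not necessarily the one derived in Eqs. 9.18 to 9.25 or drawn in Figure 9.4a.

```python
def fir_direct(a, x):
    """Reference result: y(n) = sum_i a[i] * x(n - i), with x(n) = 0 for n < 0."""
    return [
        sum(a[i] * x[n - i] for i in range(len(a)) if n - i >= 0)
        for n in range(len(x))
    ]


def fir_pipelined(a, x):
    """Simulate a chain of len(a) PEs with no broadcast lines (assumed structure).

    Each PE i holds two x registers [newer, older] and one y register.
    It reads x and y only from the registers of PE i - 1, so every
    PE-to-PE signal is registered.
    """
    n_taps = len(a)
    x_regs = [[0, 0] for _ in range(n_taps)]
    y_regs = [0] * n_taps
    collected = []

    # Extra cycles with zero input flush the last samples out of the pipeline.
    for cycle in range(len(x) + n_taps - 1):
        x_in = x[cycle] if cycle < len(x) else 0
        next_x, next_y = [], []
        for i in range(n_taps):
            # PE 0 reads the external input; PE i > 0 reads its neighbour's registers.
            xi = x_in if i == 0 else x_regs[i - 1][1]
            yi = 0 if i == 0 else y_regs[i - 1]
            next_y.append(yi + a[i] * xi)        # registered partial sum
            next_x.append([xi, x_regs[i][0]])    # shift x through two registers
        # All registers update together on the clock edge.
        x_regs, y_regs = next_x, next_y
        collected.append(y_regs[-1])

    # The first n_taps - 1 collected values are pipeline fill, not filter outputs.
    return collected[n_taps - 1:]


coeffs = [1, 2, 3, 4]
samples = [1, 0, 0, 0, 2, 5]
assert fir_pipelined(coeffs, samples) == fir_direct(coeffs, samples)
```

In this sketch the x stream moves at half the speed of the y stream, which is what lets each coefficient meet the correct input sample without broadcasting x to every PE. The cost is a pipeline-fill latency that grows with the number of taps.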