## 6.2 FOLDING TRANSFORMATION

The folding transformation provides a systematic technique for designing control circuits for hardware in which several algorithm operations are time-multiplexed onto a single functional unit. This section derives the folding equation, which is the basis of this technique, and the retiming for folding, which is used to retime a data-flow graph (DFG) prior to folding.

Consider the edge *e* connecting the nodes *U* and *V* with *w*(*e*) delays, as shown in Fig. 6.2(a). Let the executions of the *l*-th iteration of the nodes *U* and *V* be scheduled at the time units *Nl* + *u* and *Nl* + *v*, respectively, where *u* and *v* are the *folding orders* of the nodes *U* and *V* and satisfy 0 ≤ *u*, *v* ≤ *N* – 1. The folding order of a node is the time partition in which the node is scheduled to execute in hardware. The functional units that execute the nodes *U* and *V* are denoted as *H*_{U} and *H*_{V}, respectively. Note that *N* is the number of operations folded onto a single functional unit and is also referred to as the *folding factor*. If *H*_{U} is pipelined by *P*_{U} stages, then the result of the *l*-th iteration of the node *U* is available at the time unit *Nl* + *u* + *P*_{U}. Since the edge *U* → *V* has *w*(*e*) delays, the result of the *l*-th iteration of the node *U* is used by the (*l* + *w*(*e*))-th iteration of the node *V*, which is executed at *N*(*l* + *w*(*e*)) + *v*. Therefore, the result must be stored for ...
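
The scheduling argument above can be checked numerically. The following is a minimal Python sketch, not part of the original text: the function name `storage_time` and its parameter names are hypothetical, chosen to mirror the symbols *N*, *w*(*e*), *P*_{U}, *u*, *v*, and *l* defined in this section. It simply subtracts the time unit at which the result of *U* becomes available from the time unit at which *V* consumes it.

```python
def storage_time(N, w, P_U, u, v, l=0):
    """Time units between the result of U becoming available and V using it.

    N   : folding factor (operations folded onto one functional unit)
    w   : number of delays w(e) on the edge U -> V
    P_U : number of pipeline stages of the unit H_U that executes U
    u, v: folding orders of U and V, each in the range 0..N-1
    l   : iteration index (illustrative only; it cancels in the difference)
    """
    for name, order in (("u", u), ("v", v)):
        if not 0 <= order <= N - 1:
            raise ValueError(f"folding order {name}={order} is outside 0..{N - 1}")

    available_at = N * l + u + P_U   # result of the l-th iteration of U
    consumed_at = N * (l + w) + v    # (l + w(e))-th iteration of V
    return consumed_at - available_at


if __name__ == "__main__":
    # Assumed example values: N = 4, w(e) = 1, P_U = 1, u = 1, v = 3.
    print(storage_time(N=4, w=1, P_U=1, u=1, v=3))
    # The same edge evaluated at a later iteration gives the same value.
    print(storage_time(N=4, w=1, P_U=1, u=1, v=3, l=7))
```

With the assumed values *N* = 4, *w*(*e*) = 1, *P*_{U} = 1, *u* = 1, and *v* = 3, the result is available at time unit 2 and consumed at time unit 7, so the sketch returns 5 for every choice of *l*. A negative return value would mean that *V* is scheduled to consume the result before *H*_{U} has produced it.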