This section addresses the derivation of Lyon’s bit-serial multiplier using Horner’s rule. Several other bit-serial multipliers are then derived using systolic mapping techniques.
The multiplication rule in (13.5) can be used to derive bit-serial multipliers. The architecture for a 4 × 4-bit bit-serial multiplication is shown in Fig. 13.14(a), where each scaling block is a bit-serial zero-latency scaling operator whose functionality is illustrated in Fig. 13.14(b). In a bit-serial zero-latency system, the first output bit must be generated in the same clock cycle in which the first input bit enters the system. For the scaling operator, this means that the first output bit a1 must be generated at the same time instant at which the first input bit a0 enters the operator. Since input bit a1 has not yet entered the system at that instant, the scaling operator is a noncausal (advance) operation and cannot be implemented in hardware.
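The noncausality can be made concrete with a small behavioral sketch in Python. It is only an illustration under stated assumptions, not the circuit of Fig. 13.14: the word is taken to be a 4-bit two's complement number fed least significant bit first, scaling is taken to mean division by 2, and the last output cycle is assumed to repeat the sign bit. The helper names (`to_lsb_first`, `from_lsb_first`, `ideal_half`, `lookahead`) are hypothetical.

```python
def to_lsb_first(value, width):
    """Bits of a two's complement integer, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def from_lsb_first(bits):
    """Interpret an LSB-first bit list as a two's complement integer."""
    unsigned = sum(b << i for i, b in enumerate(bits))
    return unsigned - (1 << len(bits)) if bits[-1] else unsigned


def lookahead(width):
    """For each output cycle t, the input cycle whose bit is required.

    Assumption: the final cycle repeats the sign bit (sign extension).
    """
    return [min(t + 1, width - 1) for t in range(width)]


def ideal_half(bits):
    """Zero-latency halving: the bit emitted in cycle t is input bit t+1.

    This can be computed here only because the whole word is already in
    memory; a serial circuit has not yet received bit t+1 in cycle t.
    """
    return [bits[src] for src in lookahead(len(bits))]


a = to_lsb_first(-6, 4)      # a0, a1, a2, a3
half = ideal_half(a)         # a1, a2, a3, a3
noncausal_cycles = [t for t, src in enumerate(lookahead(4)) if src > t]
```

In this sketch, `noncausal_cycles` lists the output cycles that depend on an input bit from a later cycle. Cycle 0 is among them, which is exactly the situation described above: output bit a1 is needed while only a0 has arrived.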