Operand forwarding (or data forwarding) is an optimization in pipelined CPUs that limits the performance loss caused by pipeline stalls.[1] [2] A data hazard can lead to a pipeline stall when the current operation has to wait for the result of an earlier operation that has not yet finished.
ADD A B C #A=B+C
SUB D C A #D=C-A
If these two assembly pseudocode instructions run in a pipeline, then after the second instruction has been fetched and decoded, the pipeline stalls until the result of the addition has been written and can be read.
Without operand forwarding:

| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|
| Fetch ADD | Decode ADD | Read Operands ADD | Execute ADD | Write result | | | |
| | Fetch SUB | Decode SUB | stall | stall | Read Operands SUB | Execute SUB | Write result |
With operand forwarding, only one stall cycle remains:

| 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|
| Fetch ADD | Decode ADD | Read Operands ADD | Execute ADD | Write result | | |
| | Fetch SUB | Decode SUB | stall | Read Operands SUB: use result from previous operation | Execute SUB | Write result |
In some cases, operand forwarding can eliminate all stalls from such read-after-write data hazards:[3] [4] [5]
| 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|
| Fetch ADD | Decode ADD | Read Operands ADD | Execute ADD | Write result | |
| | Fetch SUB | Decode SUB | Read Operands SUB: use result from previous operation | Execute SUB | Write result |
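The three timelines above can be reproduced with a small timing model. The following Python sketch is illustrative only: the mode names (`none`, `forward`, `enhanced`) and the rule for the earliest cycle in which SUB may read its operands are assumptions chosen to match the tables, not a description of any particular CPU.

```python
# Five pipeline stages, as used in the timelines above.
STAGES = ["Fetch", "Decode", "Read Operands", "Execute", "Write result"]

# Assumed rule per mode: the earliest cycle in which the dependent
# instruction may enter Read Operands, given the producer's schedule.
EARLIEST_READ = {
    "none": lambda prod: prod["Write result"] + 1,  # wait until after write-back
    "forward": lambda prod: prod["Write result"],   # take the result as it is written
    "enhanced": lambda prod: prod["Execute"],       # take the result as it is computed
}


def schedule(mode):
    """Return (producer, consumer, stalls) as stage -> cycle maps plus a stall count."""
    # The producer (ADD) flows through the pipeline without delay: cycles 1..5.
    producer = {stage: cycle for cycle, stage in enumerate(STAGES, start=1)}

    # The consumer (SUB) is issued one cycle later.
    consumer = {"Fetch": 2, "Decode": 3}
    unstalled_read = consumer["Decode"] + 1
    read = max(unstalled_read, EARLIEST_READ[mode](producer))
    consumer["Read Operands"] = read
    consumer["Execute"] = read + 1
    consumer["Write result"] = read + 2
    return producer, consumer, read - unstalled_read


for mode in ("none", "forward", "enhanced"):
    _, sub, stalls = schedule(mode)
    print(mode, "stalls:", stalls, "SUB writes its result in cycle", sub["Write result"])
```

Under these assumptions the model yields two, one, and zero stall cycles, with SUB writing its result in cycles 8, 7, and 6 respectively.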
The CPU control unit must implement logic to detect dependencies where operand forwarding makes sense. A multiplexer can then be used to select the proper register or flip-flop to read the operand from.
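A minimal software sketch of this detection-and-select logic is shown below, using the ADD/SUB pair from the example. The names `Instr`, `forward_selects`, `mux`, and `run_pair` are hypothetical, and the code models only the selection between a register value and a forwarded result, not real pipeline hardware.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    op: str    # "ADD" or "SUB"
    dest: str  # destination register
    src1: str  # first source register
    src2: str  # second source register


OPS = {"ADD": lambda x, y: x + y, "SUB": lambda x, y: x - y}


def forward_selects(producer, consumer):
    """Dependency detection: one select signal per source operand of the consumer."""
    return tuple(src == producer.dest for src in (consumer.src1, consumer.src2))


def mux(select, forwarded_value, register_value):
    """Two-input multiplexer: pick the forwarded result or the register file value."""
    return forwarded_value if select else register_value


def run_pair(producer, consumer, regs):
    """Execute two back-to-back instructions, forwarding the first result to the second."""
    result = OPS[producer.op](regs[producer.src1], regs[producer.src2])

    # The producer has not written back yet, so regs still holds the stale value.
    sel1, sel2 = forward_selects(producer, consumer)
    a = mux(sel1, result, regs[consumer.src1])
    b = mux(sel2, result, regs[consumer.src2])
    value = OPS[consumer.op](a, b)

    # Write-back of both results.
    regs[producer.dest] = result
    regs[consumer.dest] = value
    return value


regs = {"A": 100, "B": 2, "C": 3}  # A holds a stale value before the ADD
add = Instr("ADD", "A", "B", "C")  # A = B + C
sub = Instr("SUB", "D", "C", "A")  # D = C - A
print(run_pair(add, sub, regs))    # uses the forwarded A = 5, so D = 3 - 5 = -2
```

Without the forwarded value, SUB would read the stale `A = 100` from the register file and compute `-97` instead of `-2`.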