
The difference between the dataflow approach and the improved scoreboard is best seen through examples. Consider the following code sequences.
EXAMPLE 7.5
MPY.F R1, R2, R3
ADD.F R1, R6, R7
There is an output dependency: both instructions write R1. The dataflow approach will issue the ADD.F instruction, overwriting the tag for R1, so that the R1 tag will be ADD at the end of the second cycle. The result of the MPY.F will be broadcast, but will not be ingated by any unit.
The improved scoreboard will not issue the ADD.F until the result of the MPY.F has been written into R1 (which removes the MPY tag from R1).
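The two issue rules for Example 7.5 can be sketched in a few lines of Python. This is only an illustrative model, not the hardware described in the text: the function names, the policy strings, and the tag names "MPY" and "ADD" are assumptions made for the sketch, and timing is ignored entirely.

```python
def issue(policy, reg_tags, dest, unit_tag):
    """Try to issue an instruction that writes `dest`; return True on success.

    reg_tags maps a register name to the tag of the unit that will write it,
    or None when no write is pending. (Hypothetical model, not real hardware.)
    """
    if policy not in ("dataflow", "scoreboard"):
        raise ValueError(f"unknown policy: {policy}")
    if policy == "scoreboard" and reg_tags.get(dest) is not None:
        return False                 # improved scoreboard: wait for the pending write
    reg_tags[dest] = unit_tag        # dataflow: a newer tag overwrites the older one
    return True


def broadcast(reg_tags, unit_tag):
    """Broadcast a result; return the registers that ingate it."""
    takers = sorted(r for r, t in reg_tags.items() if t == unit_tag)
    for r in takers:
        reg_tags[r] = None           # the write completes and the tag is removed
    return takers


def example_7_5(policy):
    reg_tags = {}
    log = [
        issue(policy, reg_tags, "R1", "MPY"),   # MPY.F R1, R2, R3
        issue(policy, reg_tags, "R1", "ADD"),   # ADD.F R1, R6, R7
        broadcast(reg_tags, "MPY"),             # multiplier result is broadcast
    ]
    return log, reg_tags


print(example_7_5("dataflow"))
print(example_7_5("scoreboard"))
```

Under the dataflow policy the second issue succeeds and the MPY broadcast finds no register to ingate it; under the scoreboard policy the second issue is refused until the broadcast has cleared the MPY tag from R1.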
EXAMPLE 7.6
MPY.F R1, R2, R3
ADD.F R4, R1, R6
DIV.F R1, R9, R10
This example is similar to Example 7.5, except that there is also an essential dependency between the MPY.F and the ADD.F through R1.
Both dataflow and improved scoreboard hardware will issue the ADD.F. At that point R1 carries the multiplier unit tag and R4 carries the adder reservation station tag. The adder reservation station holds the multiplier unit tag for its R1 operand.
Dataflow issues the DIV.F immediately, and the divider unit tag overwrites the multiplier tag in R1. The improved scoreboard does not issue the DIV.F until the multiply is complete.
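A small Python model can make the tag movement in Example 7.6 concrete. It is a sketch under stated assumptions: the class `TagMachine`, its method names, and the tag names "MPY", "ADD_RS", and "DIV" are invented for illustration, and the model tracks only tags, not cycles or data values.

```python
class TagMachine:
    """Hypothetical tag-tracking model of the two issue policies."""

    def __init__(self, policy):
        if policy not in ("dataflow", "scoreboard"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.reg_tags = {}   # register -> tag of the pending producer
        self.awaiting = {}   # consumer tag -> producer tags it still waits for

    def issue(self, tag, dest, sources):
        # Improved scoreboard: stall while an earlier write to dest is pending.
        if self.policy == "scoreboard" and dest in self.reg_tags:
            return False
        # Capture the producer tag of every source that is not yet available.
        self.awaiting[tag] = {self.reg_tags[s] for s in sources if s in self.reg_tags}
        # Dataflow: this assignment may overwrite an older tag on dest.
        self.reg_tags[dest] = tag
        return True

    def broadcast(self, tag):
        """Return (registers, consumers) that ingate the result tagged `tag`."""
        regs = sorted(r for r, t in self.reg_tags.items() if t == tag)
        for r in regs:
            del self.reg_tags[r]
        consumers = sorted(c for c, need in self.awaiting.items() if tag in need)
        for c in consumers:
            self.awaiting[c].discard(tag)
        self.awaiting.pop(tag, None)
        return regs, consumers


def example_7_6(policy):
    m = TagMachine(policy)
    issued = [
        m.issue("MPY", "R1", ["R2", "R3"]),      # MPY.F R1, R2, R3
        m.issue("ADD_RS", "R4", ["R1", "R6"]),   # ADD.F R4, R1, R6
        m.issue("DIV", "R1", ["R9", "R10"]),     # DIV.F R1, R9, R10
    ]
    return issued, m.broadcast("MPY"), dict(m.reg_tags)


print(example_7_6("dataflow"))
print(example_7_6("scoreboard"))
```

In the dataflow run all three instructions issue, and the multiplier result is ingated only by the adder reservation station because R1 already carries the divider tag. In the scoreboard run the DIV.F is refused, and the multiplier result is ingated by both R1 and the adder reservation station.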
EXAMPLE 7.7
ADD.F R1, R2, R3
ADD.F R4, R1, R6
In this case, both strategies behave similarly. The first ADD.F is issued to the adder unit, the second to the adder reservation station. In both cases, there are separate identifiers for the adder unit and its reservation station, so that R1 will ingate the result of the adder unit and R4 will ingate the result of the adder reservation station after it completes the addition. Note that the tag for the reservation station remains with the operation even after it enters the adder unit.
It would seem that significant concurrency is lost by the improved scoreboard relative to dataflow. This may or may not be the case in practice. The real problem for the improved scoreboard is a lack of registers. With additional registers, each of the preceding differences can be overcome.
Techniques such as register renaming (to be discussed shortly) can provide these additional registers, largely eliminating the performance difference between the approaches.
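As a rough illustration of why extra registers help, the sketch below gives every result its own fresh register, so no two instructions in flight share a destination. It is not the renaming mechanism the text goes on to describe: the function `rename`, the `P` register names, and the starting index are all assumptions made for this example.

```python
from itertools import count


def rename(program, first_free=100):
    """Rewrite (op, dest, src1, src2) tuples so each write gets a fresh register.

    Hypothetical sketch: assumes an unlimited supply of registers named P<n>.
    """
    fresh = count(first_free)
    current = {}   # original register name -> most recent renamed register
    renamed = []
    for op, dest, src1, src2 in program:
        # Sources are looked up before dest is renamed, so an instruction
        # that reads and writes the same register sees the older value.
        s1 = current.get(src1, src1)
        s2 = current.get(src2, src2)
        current[dest] = f"P{next(fresh)}"
        renamed.append((op, current[dest], s1, s2))
    return renamed


example_7_6 = [
    ("MPY.F", "R1", "R2", "R3"),
    ("ADD.F", "R4", "R1", "R6"),
    ("DIV.F", "R1", "R9", "R10"),
]
for instruction in rename(example_7_6):
    print(instruction)
```

After renaming, the MPY.F and the DIV.F write different registers, so the output dependency that stalled the improved scoreboard in Example 7.6 no longer exists, while the ADD.F still reads the register written by the MPY.F.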

 