Consider now what happens when the BTB contains target addresses only:

Again, there is no branch delay on correctly guessed branches. With the timing template above, the BTB needs to hold only target addresses. This conclusion depends, of course, on the particular timing template.

Figure 4.25 shows the tree of possible outcomes. The expected delay for branches can be computed as shown in the following study.

Study 4.8 Branch Target Buffer

What is the delay due to a branch instruction?

First, we represent the outcome tree:

Next, we sum up the expected outcome delay:

(0.6)(0.8)(0) + (0.6)(0.2)(4) + (0.4)(0.2)(5) + (0.4)(0.8)(0) = 0.88 cycles.

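The sum above can be checked with a short Python sketch. The probabilities and delays are the ones that appear in the sum. The leaf labels are only an assumed reading of the outcome tree (first level: the guess made from the BTB, second level: whether that guess is correct), and the names `OUTCOMES` and `expected_delay` are hypothetical.

```python
# Each leaf: (assumed label, first-level probability, second-level probability, delay in cycles).
# Probabilities and delays come from the sum in the text; the labels are an assumption.
OUTCOMES = [
    ("guess target, correct", 0.6, 0.8, 0),
    ("guess target, wrong",   0.6, 0.2, 4),
    ("guess in-line, wrong",  0.4, 0.2, 5),
    ("guess in-line, correct", 0.4, 0.8, 0),
]


def expected_delay(outcomes):
    """Weight each leaf's delay by the product of the probabilities along its path."""
    return sum(p1 * p2 * delay for _label, p1, p2, delay in outcomes)


# The leaf probabilities should cover every outcome exactly once.
total_probability = sum(p1 * p2 for _label, p1, p2, _delay in OUTCOMES)

btb_delay = expected_delay(OUTCOMES)
print(f"expected delay with BTB = {btb_delay:.2f} cycles")
```

Only the two mispredicted leaves contribute to the result, so the expected delay is sensitive mainly to the prediction accuracy and to the two misprediction penalties.
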
Note that without the BTB, had we simply guessed in-line on every branch, the penalty would be significantly larger.

Assume that the in-line delay is 0, while the target delay is 5 cycles:

Expected delay = (0.44)(0) + (0.56)(5) = 2.8 cycles.

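The weights 0.44 and 0.56 are not derived in the text. A plausible reading, offered here as an assumption, is that they are the overall probabilities that a branch goes in-line or to the target, obtained by regrouping the four leaf probabilities of the BTB study. The Python sketch below checks that this reading reproduces the figures; the names `LEAVES`, `IN_LINE_DELAY`, and `TARGET_DELAY` are hypothetical.

```python
# Leaves as (probability, branch actually goes to target?).
# The probabilities are those used in the BTB study; which leaves go to the
# target is an assumption chosen to be consistent with the 0.44 / 0.56 split.
LEAVES = [
    (0.6 * 0.8, True),   # guessed target, guess correct
    (0.6 * 0.2, False),  # guessed target, guess wrong
    (0.4 * 0.2, True),   # guessed in-line, guess wrong
    (0.4 * 0.8, False),  # guessed in-line, guess correct
]

IN_LINE_DELAY = 0  # cycles when an always-in-line guess is right
TARGET_DELAY = 5   # cycles when the branch goes to the target instead

p_target = sum(p for p, to_target in LEAVES if to_target)
p_in_line = sum(p for p, to_target in LEAVES if not to_target)

# Without a BTB, every branch that goes to the target pays the target delay.
no_btb_delay = p_in_line * IN_LINE_DELAY + p_target * TARGET_DELAY

print(f"P(in-line) = {p_in_line:.2f}, P(target) = {p_target:.2f}")
print(f"expected delay without BTB = {no_btb_delay:.1f} cycles")
```

Under these assumptions the always-in-line policy costs 2.8 cycles per branch against 0.88 cycles with the BTB, roughly a threefold difference.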