|
|
|
|
|
|
Figure 4.21 Branch speedup: the two basic techniques consist of (1) advancing the relative time of the CC set and (2) concurrently decoding and computing the branch target address. |
|
|
|
|
|
|
|
|
The most obvious approach to reducing branch delay is simply to speed up: |
|
|
|
|
|
|
|
|
2. The relative time at which CC is set. |
|
|
|
|
|
|
|
|
Depending on the instruction set, the TIF time can be reduced by using a separate branch adder. This adder operates on each instruction while it is being decoded, performing address generation (AG) to form a target instruction address concurrently with decode (D/AG). The branch adder assumes that every instruction is a branch; if the instruction is not a branch, the result of the addition is discarded. Thus, if the instruction is a branch, AG completes at the same time as the branch is decoded. Indeed, if the target instruction address lies in the same virtual page (has the same upper address bits), even the translate (T) step can be skipped, and the TIF can begin immediately after the decode (D) cycle (Figure 4.21). Data from Chapter 3 (on centered branches) indicate that for 4KB pages, about 79% of target addresses lie in the same page. |
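The branch adder and the same-page test can be sketched in a few lines of Python. This is only an illustrative model, not a description of any particular machine: it assumes PC-relative branches, 4KB pages, and byte addresses, and the names `branch_adder`, `same_page`, and `decode_with_branch_adder` are hypothetical.

```python
PAGE_SIZE = 4096                         # 4KB pages, as in the text
PAGE_SHIFT = PAGE_SIZE.bit_length() - 1  # 12 low-order bits are the page offset


def branch_adder(pc: int, displacement: int) -> int:
    """Speculative AG: form a target address as if the instruction were a branch.

    Assumption: the target is PC-relative (pc + displacement).
    """
    return pc + displacement


def same_page(pc: int, target: int) -> bool:
    """True when both addresses share the same upper (page-number) bits."""
    return (pc >> PAGE_SHIFT) == (target >> PAGE_SHIFT)


def decode_with_branch_adder(pc: int, is_branch: bool, displacement: int):
    """Model one D/AG cycle.

    Returns (target, needs_translate). For a non-branch, the adder result
    is discarded and (None, False) is returned.
    """
    target = branch_adder(pc, displacement)  # computed in parallel with decode
    if not is_branch:
        return None, False                   # result of the addition is discarded
    # If the target stays in the current page, the T step can be skipped.
    return target, not same_page(pc, target)


if __name__ == "__main__":
    print(decode_with_branch_adder(0x1F00, True, 0x80))   # target in the same page
    print(decode_with_branch_adder(0x1F00, True, 0x200))  # target crosses a page boundary
```

In this model, a branch whose target shares the page number of the branch itself can start its TIF directly after the decode cycle; any other branch still needs the translate step.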
|
|
|
|
|
|
|
|
Knowing the state of the condition code early always helps to reduce branch delay. Two strategies for providing early data outcomes (study 4.4) are: |
|
|
|
|
|
|
|
|
1. Early condition code setting. |
2. Delayed branch. |
|
|
|
|
|
|
|
|
The first approach places the instruction whose result is to be tested early in the code sequence, so that the CC is set by the time the conditional branch needs it. Instructions that do not affect the CC fill the gap between that instruction and the branch. The second approach is similar, except that the action of the branch is delayed by a designated number of instructions, as discussed in study 4.4. The delayed branch (DB) relies on instructions placed after it to minimize the branch delay: they get useful work done while the CC is being resolved. |
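The effect of both approaches can be illustrated with a small cycle-count model. It is a sketch under simplifying assumptions that are not taken from the text: one instruction issues per cycle, the CC becomes available a fixed number of cycles after the CC-setting instruction issues, and every filled slot holds useful work. The function names and the example latencies are hypothetical.

```python
def cc_stall_cycles(cc_latency: int, intervening: int) -> int:
    """Early CC setting: cycles a conditional branch waits for the CC.

    cc_latency  -- cycles from issuing the CC-setting instruction until the
                   CC is available (assumed fixed)
    intervening -- instructions that do not affect the CC, placed between
                   the CC-setting instruction and the branch
    """
    return max(0, cc_latency - intervening)


def delayed_branch_penalty(branch_delay: int, filled_slots: int) -> int:
    """Delayed branch: delay cycles left after useful instructions placed
    after the branch fill some of the delay slots."""
    return max(0, branch_delay - filled_slots)


if __name__ == "__main__":
    # Hypothetical 3-cycle CC latency: each intervening instruction hides one cycle.
    for n in range(5):
        print(n, cc_stall_cycles(3, n))
    # Hypothetical 2-cycle branch delay with one useful instruction after the DB.
    print(delayed_branch_penalty(2, 1))
```

In both cases the remaining delay shrinks by one cycle for each useful instruction that can be scheduled, and reaches zero once enough of them are available.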
|
|
|
|
|
|
|
|
4.5.3 Branch Prediction Strategies |
|
|
|
|
|
|
|
|
Target fetch delay represents the pipeline "delay" that may occur between a taken branch instruction and its target. This change in the executed sequence of instructions causes the contents of part of the pipeline to be discarded, and the pipeline to be reloaded. This "branch problem" is closely |
|
|
|
|
|