Page 225
Table 4.7 summarizes the major aspects of each of these approaches. In the following sections we will look at the details of each approach.
4.5.1 Branch Elimination
Sequences of code that include small basic blocks (i.e., one or two instructions between branches) can be very disruptive to pipelined instruction execution. Referring back to chapter 3 (Figure 3.4), we can see that about 8% of all basic blocks consist of a single instruction. In these cases we frequently can eliminate the initial branch and make the execution (or PA) of the subsequent instruction conditional on a certain condition code.
For example, suppose we had:
    OP
    BC   CC = Z, * + 2
    ADD  R3,R2,R1
    BC   CC, ALPHA
This could be replaced by:
    OP
    ADD  R3,R2,R1,NZ
    BC   CC, ALPHA
That is, if the CC = Z (zero), the ADD instruction is not executed.
This elimination of the first BC requires that other operations such as the ADD carry a condition code specifier. The result of the ADD is actually stored (PA) only if the machine CC corresponds to the CC in the instruction. Such instructions are called conditional instructions.
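The equivalence of the two sequences above can be checked with a small simulation. This is only a sketch under stated assumptions: the function names are hypothetical, the condition code is reduced to a single zero/nonzero flag, and the operand order (R3 = R2 + R1, destination first) is assumed for illustration rather than taken from the text.

```python
# Sketch: a branch-over-ADD sequence versus a conditional ADD.
# Assumption: ADD R3,R2,R1 means R3 = R2 + R1 (destination first).

def with_branch(regs, cc_is_zero):
    """Models BC CC = Z, * + 2 followed by ADD R3,R2,R1."""
    regs = dict(regs)                      # work on a copy of the register file
    if cc_is_zero:
        return regs                        # branch taken: the ADD is skipped
    regs["R3"] = regs["R2"] + regs["R1"]
    return regs

def with_conditional_add(regs, cc_is_zero):
    """Models ADD R3,R2,R1,NZ: result is stored (PA) only if CC is nonzero."""
    regs = dict(regs)
    result = regs["R2"] + regs["R1"]       # the sum can be computed regardless
    if not cc_is_zero:                     # NZ specifier matches the machine CC
        regs["R3"] = result                # the store (PA) happens only here
    return regs

start = {"R1": 5, "R2": 7, "R3": 0}
for cc_is_zero in (True, False):
    # Both sequences must leave the registers in the same state.
    assert with_branch(start, cc_is_zero) == with_conditional_add(start, cc_is_zero)
```

In this model the only difference between the two versions is where the condition is consulted: at the branch, or at the point where the result would be stored.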
The preceding example presumes that almost any instruction can be made conditional. In practice, the number of instructions that can be conditional is usually limited. One alternative is to have a single select instruction, which chooses between two register values depending on a condition code. This approach has been referred to as guarding the instruction [134].
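A select instruction can stand in for the conditional ADD of the earlier example, roughly as sketched here. The names select and add_then_select, the scratch value, and the destination-first operand order are all assumptions made for illustration; the text does not specify the exact form of the instruction.

```python
# Sketch: an unconditional ADD into a scratch value, followed by a
# hypothetical select that picks one of two values based on the CC.

def select(cc_is_zero, value_if_zero, value_if_nonzero):
    """Chooses between two register values depending on the condition code."""
    return value_if_zero if cc_is_zero else value_if_nonzero

def add_then_select(regs, cc_is_zero):
    """Same effect as ADD R3,R2,R1,NZ, using only one conditional operation."""
    regs = dict(regs)                      # work on a copy of the register file
    temp = regs["R2"] + regs["R1"]         # ordinary, unconditional ADD
    # Keep the old R3 when CC = Z; otherwise take the new sum.
    regs["R3"] = select(cc_is_zero, regs["R3"], temp)
    return regs
```

With this arrangement only the select needs a condition code specifier; every other instruction stays unconditional, at the cost of one extra instruction and a scratch register.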
Since all of these approaches require knowledge of the CC before instruction PA, they are still subject to delay, depending on the timing template. Indeed, for machines that require the PA to be in order, these techniques have limited value. For machines that allow out-of-order execution, however, these techniques can be helpful.
4.5.2 Branch Speedup
For a simple processor, the branch delay (Figure 4.18) is:
Branch delay = max {time for TIF, time for CC set}.
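As a quick numeric illustration of this expression, the sketch below evaluates it for made-up cycle counts. The function name and the numbers are hypothetical and are not taken from Figure 4.18.

```python
# Sketch: the branch delay is set by whichever of the two events finishes later.

def branch_delay(tif_time, cc_set_time):
    """Branch delay = max {time for TIF, time for CC set}, in cycles."""
    return max(tif_time, cc_set_time)

# Hypothetical timings: TIF takes 4 cycles, the CC is set after 2 cycles.
print(branch_delay(4, 2))   # TIF dominates
# Hypothetical timings: the CC is set late, after 5 cycles.
print(branch_delay(4, 5))   # CC set dominates
```

Shortening only the faster of the two times leaves the delay unchanged, since the slower one still bounds it.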