
Page 210
(i) Early CC (condition code) setting:
Here, the compiler rearranges code to place useful instructions that do not set the CC (e.g., load, store) between the instruction that sets the CC and the branch that tests it. We evaluate the effect of n = 1, 2, and 3 intervening instructions, where n is the number of instructions between the CC-setting instruction and the branch. If the CC is set in * - 1 (the instruction immediately preceding the branch), then n = 0.
For n = 0, the BC penalty is 5.5 cycles, as determined in study 4.3. The effect of n intervening instructions is simply to delay the scheduled decode time (D′). When the branch is taken, this reduces the BC penalty to the greater of 6.0 - n or the unconditional branch penalty. If the branch is untaken, the BC penalty is 5.0 - n. There is no effect on the unconditional branch. Thus,
                      n = 0   n = 1   n = 2   n = 3   n = 5*
BC penalty - taken     6.0     5.0     4.0     4.0     4.0
           - untaken   5.0     4.0     3.0     2.0     0
BC delay (cycles)      2.03    1.88    1.73    1.65    1.5
* n is the number of instructions between the CC setting instruction and the branch.
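
The penalty rule stated above can be checked against the table with a minimal Python sketch. The function name `bc_penalty` is hypothetical. The 4.0-cycle unconditional branch penalty is an assumption read off the table, where the taken penalty stops falling at 4.0. The BC delay row is not reproduced, since the branch frequencies used to weight it are not given here.

```python
# Assumption: the unconditional branch penalty is 4.0 cycles, inferred from
# the floor of the "taken" row in the table.
UNCONDITIONAL_PENALTY = 4.0


def bc_penalty(n, taken):
    """Conditional-branch (BC) penalty in cycles.

    n is the number of instructions between the CC-setting instruction
    and the branch.
    """
    if taken:
        # Taken: the greater of 6.0 - n or the unconditional branch penalty.
        return max(6.0 - n, UNCONDITIONAL_PENALTY)
    # Untaken: 5.0 - n, which reaches 0 at n = 5.
    return max(5.0 - n, 0.0)


if __name__ == "__main__":
    for n in (0, 1, 2, 3, 5):
        print(n, bc_penalty(n, taken=True), bc_penalty(n, taken=False))
```

Under these assumptions the function reproduces both penalty rows of the table for n = 0, 1, 2, 3, and 5.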

The performance figures include only the effects of branches, both conditional and unconditional.
(ii) Delayed branch.
The delayed branch (DB) has an effect similar to the early setting of the CC. The delayed branch (at *) may immediately follow the instruction setting the CC, but the resulting instruction (target or in-line) is not decoded until * + n + 1. Other useful instructions (if available) are placed in-line after the branch, and the branch itself takes effect n instructions later (n is usually fixed at 1 or 2 but may be a parameter of the DB).
The following illustrates the delayed branch (n = 1). Assume the delayed branch (DB) is unconditional:
       DB     ALPHA
       LD
ALPHA  INSTR
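
The execution order implied by this fragment can be sketched with a small Python model. This is only an illustration under stated assumptions: `run_delayed_branch` is a hypothetical helper, the DB is unconditional as in the text, and an extra `ADD` instruction (not in the original fragment) is placed before ALPHA so that the skipped instruction is visible.

```python
def run_delayed_branch(program, n=1, max_steps=20):
    """Return the mnemonics in execution order.

    program is a list of (label, op, target) tuples. "DB" is an
    unconditional delayed branch that takes effect after n further
    in-line instructions (the delay slots) have been executed.
    """
    labels = {label: i for i, (label, _, _) in enumerate(program) if label}
    trace = []
    pc = 0
    pending = None  # [delay slots remaining, target index] once a DB is seen
    while pc < len(program) and len(trace) < max_steps:
        _, op, target = program[pc]
        trace.append(op)
        pc += 1
        if pending is not None:
            pending[0] -= 1  # this instruction filled one delay slot
        if op == "DB" and pending is None:
            # A DB inside another DB's delay slots is ignored in this sketch.
            pending = [n, labels[target]]
        if pending is not None and pending[0] == 0:
            pc = pending[1]  # the branch takes effect now
            pending = None
    return trace


# The fragment from the text, plus a hypothetical ADD that the branch skips.
PROGRAM = [
    (None, "DB", "ALPHA"),
    (None, "LD", None),
    (None, "ADD", None),
    ("ALPHA", "INSTR", None),
]

if __name__ == "__main__":
    print(run_delayed_branch(PROGRAM, n=1))
```

With n = 1 the LD in the delay slot executes before control reaches ALPHA and the ADD is skipped; with n = 0 the model behaves like an ordinary branch and LD is skipped as well.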
With proper implementation support, the DB can reduce penalties for all branch types. Again assume that we prefetch in-line, and fetch only one target instruction:
For                  n = 0   n = 1   n = 2   n = 3   n = 5
DB unconditional      4.0     3.0     2.0     1.0
DB conditional
         - taken      6.0     5.0     4.0     3.0     1.0
         - untaken    5.0     4.0     3.0     2.0     0
DB delay (cycles)     2.03    1.83    1.63    1.43    1.08
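
The DB penalty entries above follow a simple pattern: each branch type starts from its n = 0 penalty and loses one cycle per delay slot. A minimal Python sketch of that pattern follows. The name `db_penalty` is hypothetical, and clamping at zero is an assumption: it matches the untaken entry at n = 5, but the table gives no n = 5 entry for the unconditional DB, so the value the function returns there is an extrapolation, not a figure from the text.

```python
# Penalties at n = 0, taken from the first column of the table.
BASE_PENALTY = {"unconditional": 4.0, "taken": 6.0, "untaken": 5.0}


def db_penalty(n, kind):
    """Delayed-branch penalty in cycles for n delay slots.

    kind is "unconditional", "taken", or "untaken".
    Assumption: the penalty falls by one cycle per slot and never goes
    below zero.
    """
    return max(BASE_PENALTY[kind] - n, 0.0)


if __name__ == "__main__":
    for n in (0, 1, 2, 3, 5):
        print(n, [db_penalty(n, kind) for kind in BASE_PENALTY])
```

Unlike early CC setting, the taken penalty here is not floored at the unconditional branch penalty, which is why the taken entry continues to fall to 1.0 at n = 5.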

 