< previous page page_228 next page >

Page 228
Table 4.10 Static branch prediction success rates (commercial environment).
Instruction ClassBranch Instr %Guess% Correct Guess
BR unconditional(72.5)(.40)=29S29
BC conditional(72.5)(.60)=43.5U(43.5)(.59)=25.6
Loop(9.8)S(9.8)(.91)=9
Call/Return(17.7)S17.7
81.3%

the goal of branch prediction is not simply accuracy but rather processor speedup, and up to a certain point more performance is gained by guessing in-line rather than target.
In a conditional branch, more is gained by guessing and going in-line than to target. One should select a strategy that minimizes the expected branch delay, not simply maximizes the prediction accuracy. This is more fully described in study 4.6. Any successful strategy must be coupled to an appropriately designed I-buffer (both in-line and target) to avoid run-out.
For a commercial environment (Tables 3.53.10), we can achieve even better prediction rates (Table 4.10). For this environment, a simple static strategy gives over 80% correct prediction rates.
One dynamic strategy is to make predictions of the outcomes of branch instructions based on past history; that is, use the sequence of past actions of a branchwas it or was it not taken?to predict what will happen the next time it is encountered. Table 4.11 from Lee and Smith [181] shows the effectiveness of a branch prediction when prediction is based on exactly the n preceding executions of the branch in question, and whether that branch was taken or not taken. The prediction algorithm is quite simple. For example, with n = 3, if in two (or three) of the previous branch outcomes the branch was taken, it is now predicted to be taken; otherwise, it is predicted as not taken. In implementing this scheme, a small up/down counter can be associated with a cache line. The organization of such a cache is shown in Figure 4.22. If the branch is taken, the counter is incremented up to a maximum value (n). An unsuccessful branch decrements the counter. A count of zero is the same as n = 0. Of course, as mentioned earlier, prediction should be based not only on accuracy of outcome, but on minimization of expected branch delay.
The counters are referred to as saturating counters, since they remain at a maximum value on repeated increment.
If the line size is small then usually only one branch is present per line and only one counter is needed. When two branches are present, their results can be merged in the counter. This limits the effectiveness, but only in the (presumably) unlikely event of multiple branch instructions in the same line. As larger lines are used, multiple history counters can be implemented.
A number of interesting observations can be made from Table 4.11. First, the predictive accuracy very closely approaches its maximum with one or two preceding branches used for prediction. Second, the predictive accuracy for as few as two preceding branches is from 83.4 to 96.5%, which is

 
< previous page page_228 next page >