< previous page page_493 next page >

Page 493
an amount equal to the expected incorrect guess rate. As prediction rates may vary from environment to environment, it is useful to make speculative execution controllable by the programmer.
As we will see in the next section, speculative execution can provide significant additional program speedup, but the designer must be fully aware of the additional cost and complexity it introduces into the processor.
7.6.10 Adaptive Speculation
Since correct prediction in speculative execution is essential for performance, there have been a number of studies into improved branch predictions. As we saw in chapter 4, dynamic branch prediction, even through the use of history bits or branch table buffers or a combination of the two, is generally limited to prediction rates around 90% across multiple environments. Yeh and Patt [314, 315] and others have looked at adaptive branch prediction as a method of raising prediction rates to 95%. The basic method consists of associating a shift register with each branch in, for example, a branch table buffer. The shift register records branch history. A branch twice taken and twice not taken, for example, would be recorded as ''1100." Each pattern acts as an address into an array of counters, such as the 2-bit saturating counters discussed in chapter 4. Each time the pattern 1100 was encountered, the outcome was recorded in the saturating counter. If the branch was taken, the counter was incremented; if the branch was not taken, it was decremented.
Adaptive techniques can require a good deal of support hardware. Not only must we have history bits associated with the possible branch entries, but we must have a table of counters to store outcomes. The larger the program, the more effective the large shift registers are. For small programs, it is unlikely that more than 12 history bits (i.e., an array of 212 = 4096 counters) offers any performance improvement. A number of large programs were run following the work of Yeh and Patt [315]. For these programs, the best that could be achieved for static branch prediction was 87.7%. This is termed optimum static branch prediction and is achievable only when a program is run multiple times. In this strategy, the dominant branch outcome (i.e., the complete history) is determined for each branch, and a "best guess"is determined. This can be biased (see chapter 4) by the unequal in-line and target branch penalties. Some results are shown in Figure 7.51.
An advanced adaptive strategy such as a 12-bit shift register (history bits, as shown in Figure 7.51) improves prediction rate to about 92%. The 2-bit saturating counter discussed in chapter 4 achieves 89.3% averaged over all programs. This corresponds to a history count of n = 4 in Table 4.11. While the data in Figure 7.51 is based on a different set of programs than those presented in Table 4.11, the results are generally consistentequally weighting compiler, scientific, and supervisor codes predicts an expected hit rate of 90% (n = 4).
The adaptive results are shown in two curves. The upper curve is the prediction rate averaged over all programs. The lowest curve is the prediction rate achieved by the program with the poorest prediction rate. Differences between 86% and 92% may not seem significant, but if the delay in processor execution is represented primarily by miscast branches, the difference is

 
< previous page page_493 next page >