
Figure 5.43
Effect of set associativity on CPU time per instruction.
and longer CPU cycle time. For example, if the cache size is doubled from 128KB to 256KB, the resulting increase in CPU cycle time must be less than 3 ns for the machine to run faster. Otherwise, machine performance may actually degrade.
Now consider the choice of set associativity.
Figure 5.43 shows three CTPI (CPU time per instruction) contours for different set associativities. The differences between set associativities vary quite a bit, largely because the miss penalty is expressed as an integral number of CPU cycles. The choice of set associativity is not as straightforward as the miss rate figures suggest. For example, at a CPU cycle time of 55 ns, an 8KB two-way associative cache is only marginally better than an 8KB direct-mapped cache, so the direct-mapped cache is preferred for its lower implementation cost. At a cache size of 32KB, a four-way associative cache is only marginally better than a two-way associative cache, which in turn is clearly better than a direct-mapped cache, so the two-way associative cache becomes the preferred choice.
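The effect of an integral miss penalty on CPU time per instruction can be illustrated with a minimal sketch. The model and every number below (base CPI, references per instruction, miss rate, memory latency) are assumptions chosen for illustration and are not taken from Figure 5.43.

```python
import math

def ctpi_ns(cycle_ns, base_cpi, refs_per_instr, miss_rate, mem_latency_ns):
    """CPU time per instruction (ns) under a simple, assumed model.

    The miss penalty is rounded up to a whole number of CPU cycles,
    which is what makes the curve jagged as the cycle time varies.
    """
    penalty_cycles = math.ceil(mem_latency_ns / cycle_ns)
    cpi = base_cpi + refs_per_instr * miss_rate * penalty_cycles
    return cpi * cycle_ns

# Hypothetical parameters: base CPI 2.0, one memory reference per
# instruction, 10% miss rate, 300 ns memory latency.
for cycle in (55, 59, 60):
    print(cycle, round(ctpi_ns(cycle, 2.0, 1.0, 0.10, 300), 1))
```

Under these assumed values, moving from a 59 ns to a 60 ns cycle drops the penalty from six cycles to five, so the slower clock yields a slightly lower time per instruction. Steps of this kind are one way the contours for different associativities can converge or diverge irregularly.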
In cases where the cache determines the CPU cycle time, the effect of an increase in cycle time should be examined. Figure 5.43 shows that if changing from direct mapping to two-way associative mapping increases the cycle time by more than 4 ns, machine performance can actually be degraded. In that case, the simpler direct-mapped cache should be used.
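The break-even reasoning above can be sketched as follows. This is a simplified illustration, not the method used to produce Figure 5.43: it holds the miss penalty fixed in cycles, and the miss rates and other parameters are hypothetical.

```python
def cpi(base_cpi, refs_per_instr, miss_rate, penalty_cycles):
    """Cycles per instruction under a simple, assumed miss-penalty model."""
    return base_cpi + refs_per_instr * miss_rate * penalty_cycles

def break_even_increase_ns(cycle_ns, cpi_old, cpi_new):
    """Largest cycle-time increase (ns) at which the new cache is not slower.

    Solves cpi_old * cycle = cpi_new * (cycle + delta) for delta.
    """
    return cycle_ns * (cpi_old / cpi_new - 1.0)

# Hypothetical miss rates: 6% direct-mapped, 4% two-way associative.
cpi_dm = cpi(2.0, 1.0, 0.06, 6)
cpi_2w = cpi(2.0, 1.0, 0.04, 6)
delta = break_even_increase_ns(55, cpi_dm, cpi_2w)
print(f"two-way pays off only if the cycle grows by less than {delta:.2f} ns")
```

If the extra associativity lengthens the cycle by more than this amount, the lower miss rate no longer compensates, and the direct-mapped design gives the lower time per instruction.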
Now let us consider the change of instruction template.
In the previous discussion, the same instruction template is used for different values of CPU cycle time. When the CPU cycle time is shorter than the cache access time, there is no reason to use the same template. A template based on a shorter processor cycle time should be used, requiring multiple cycles for cache access.
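As a rough illustration of how the template depends on cycle time, the sketch below computes how many whole CPU cycles a cache access would occupy. The function name is hypothetical, and the 65 ns access time is used only as an example value.

```python
import math

def cache_access_cycles(t_cache_ns, cycle_ns):
    """Whole CPU cycles needed to cover one cache access (at least one)."""
    return max(1, math.ceil(t_cache_ns / cycle_ns))

# Example: a 65 ns cache access at several CPU cycle times.
for cycle in (70, 65, 40, 30, 20):
    print(cycle, cache_access_cycles(65, cycle))
```

A cycle time at or above the access time needs a single cycle per access, while shorter cycle times call for a template that spreads the access over two or more cycles.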
Suppose the cache access time (Tcache) is 65 ns. When the CPU cycle time is greater than 65 ns, a shorter template is used and the CPI decreases from

 