Another decision is whether adding a secondary cache improves performance by more than 20%, enough to offset its 20% higher cost. With a second-level cache, the CPI is computed as follows:

$$
\begin{aligned}
\text{CPI}_{\text{2nd cache}} ={}& \text{CPI}_{\text{Base}} + \text{TLB}_{\text{penalty}} \\
&+ \text{DR/cycle} \times \text{D-cache.MR} \times \bigl(T_{\text{access.L2}} + (L/W - 1)\,T_{\text{cycle.L2}}\bigr) \\
&+ \text{Cache.L2.MR/D-cache.MR} \times T_{\text{line}}/\text{cycle time} \\
&+ \text{I-cache.MR} \times \text{IR/cycle} \times \bigl(T_{\text{access.L2}} + (L/W - 1)\,T_{\text{cycle.L2}}\bigr) \\
&+ \text{Cache.L2.MR/I-cache.MR} \times T_{\text{line}}/\text{cycle time}.
\end{aligned}
$$

The following assumptions were made in generating the preceding model:

- The secondary cache is CBWA.
- The secondary line size is 128 bytes.
- Instruction references do not collide with data references. Contention can occur in the second-level cache, as in any unified cache, but the on-chip cache and the write assembly cache filter out most of the requests.
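
As a rough illustration, the second-level-cache CPI model can be evaluated term by term as in the sketch below. The function and parameter names are hypothetical, and the numbers in the usage note are made up for illustration; none come from the analysis. Two assumptions are worth flagging: the quantities written Cache.L2.MR/D-cache.MR and Cache.L2.MR/I-cache.MR are passed in directly as single inputs, since their exact definition is not given here, and the 20% decision is reduced to a CPI ratio, which presumes the cycle time and instruction count are unchanged by adding the cache.

```python
def cpi_with_l2(cpi_base, tlb_penalty, dr_per_cycle, ir_per_cycle,
                d_cache_mr, i_cache_mr, l2_mr_data, l2_mr_instr,
                t_access_l2, t_cycle_l2, line_size, bus_width,
                t_line, cycle_time):
    """Evaluate the second-level-cache CPI model term by term.

    l2_mr_data / l2_mr_instr stand for the quantities the model writes as
    Cache.L2.MR/D-cache.MR and Cache.L2.MR/I-cache.MR (assumption: they are
    supplied as given rather than derived here).
    """
    # T_access.L2 + (L/W - 1) * T_cycle.L2, shared by the data and instruction terms
    l2_fetch = t_access_l2 + (line_size / bus_width - 1) * t_cycle_l2
    # T_line / cycle time
    line_cycles = t_line / cycle_time
    return (cpi_base + tlb_penalty
            + dr_per_cycle * d_cache_mr * l2_fetch
            + l2_mr_data * line_cycles
            + i_cache_mr * ir_per_cycle * l2_fetch
            + l2_mr_instr * line_cycles)


def worth_adding(cpi_without, cpi_with, required_gain=0.20):
    """True if the CPI improvement exceeds the required gain.

    Assumption: performance is compared through CPI alone, with the same
    cycle time and instruction count in both configurations.
    """
    return cpi_without / cpi_with > 1.0 + required_gain


# Illustrative inputs only (not taken from the analysis).
example_cpi = cpi_with_l2(
    cpi_base=1.0, tlb_penalty=0.05, dr_per_cycle=0.3, ir_per_cycle=1.0,
    d_cache_mr=0.05, i_cache_mr=0.02, l2_mr_data=0.002, l2_mr_instr=0.001,
    t_access_l2=4, t_cycle_l2=1, line_size=32, bus_width=8,
    t_line=200, cycle_time=10)
```

Calling `worth_adding(cpi_without, example_cpi)` with the CPI of the configuration that lacks a secondary cache then answers the 20% question under the stated assumptions.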

A secondary cache serves as intermediate storage between the processor and main memory. It shields the processor from traffic on the memory bus and can reduce the amount of traffic on that bus. This is especially critical in a multiprocessor system, where we want to avoid bus saturation and isolate coherence traffic from the processor. Most secondary caches are managed using CBWA to reduce write traffic on the shared bus. To limit interference from coherence traffic, the first-level cache is WTNWA, which keeps the secondary cache consistent with the first-level cache.

When multiple users share a bus, they can contend for it. A simple model takes into account each user's occupancy of the bus. We follow the model presented in chapter 6.
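
To make the notion of bus occupancy concrete, the sketch below sums the fraction of cycles each user keeps the bus busy. It is not the chapter 6 model: it is a minimal first-order estimate that assumes every request holds the bus for the same number of cycles and ignores queueing delay. The function name and the sample rates are hypothetical.

```python
def bus_occupancy(requests_per_cycle, cycles_per_transfer):
    """Total fraction of cycles the bus is busy.

    requests_per_cycle: one bus request rate per user (hypothetical inputs).
    cycles_per_transfer: bus cycles held per request, assumed equal for all users.
    A result at or above 1.0 means the offered load would saturate the bus.
    """
    return sum(rate * cycles_per_transfer for rate in requests_per_cycle)


# Two users issuing 0.01 and 0.02 requests per cycle, 8 bus cycles per transfer.
occupancy = bus_occupancy([0.01, 0.02], 8)
```

An occupancy well below 1.0 suggests little contention; as it approaches 1.0, waiting time grows and a model that accounts for queueing is needed.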

The previous section detailed the models and discussed the implementation options qualitatively. This section presents the results of the analysis. Because of space limitations, only a small part of the data can be shown. Most of the data shown is for the 0.75 µm case, and we also discuss results for the other feature sizes.