|
 |
|
|
|
|
What is the excess CPI due to cache misses? |
|
|
|
|
|
|
|
|
10. Suppose that the cache outlined in problem 1 replaces the cache in study 5.3. Find the new effective performance (split cache and integrated cache). |
|
|
|
|
|
|
|
|
11. A certain processor produces a 32-bit virtual address. Its address space is segmented (each segment is 1 megabyte maximum) and paged (512-byte pages). The physical word transferred to/from cache is 4 bytes. |
|
|
|
 |
|
|
|
|
A TLB is to be used, organized set associative, 128 × 2. If the address bits are labeled V0–V31 for virtual address and R0–R31 for real address, least to most significant: |
|
|
|
 |
|
|
|
|
(a) Which bits are unaffected by translation (i.e., Vi = Ri)? |
|
|
|
 |
|
|
|
|
(b) If the TLB is addressed by the low-order bits of the portion of the address to be translated (i.e., no hashing), which bits are used to address the TLB? |
|
|
|
 |
|
|
|
|
(c) Which virtual bits are compared to virtual entries in the TLB to determine whether a TLB hit has occurred? |
|
|
|
 |
|
|
|
|
(d) As a minimum, which real address bits does the TLB provide? |
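The bit-field reasoning behind parts (a) through (d) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the textbook's solution: "128 × 2" is read as 128 sets of 2 entries each, every virtual bit above the page offset (segment number included) is assumed to go through the TLB, and the real address is taken to be the full R0–R31. The function name `tlb_fields` and its dictionary keys are hypothetical.

```python
def tlb_fields(page_bytes=512, tlb_sets=128, va_bits=32, ra_bits=32):
    """Return inclusive (low, high) bit ranges for each address field.

    Assumptions (not stated in the problem): tlb_sets is the number of
    sets in a "128 x 2" TLB, and all bits above the page offset are
    translated.
    """
    # A power-of-two size n needs log2(n) bits; bit_length() - 1 gives that.
    offset_bits = page_bytes.bit_length() - 1   # 512-byte page -> 9 bits
    index_bits = tlb_sets.bit_length() - 1      # 128 sets -> 7 bits

    index_low = offset_bits
    tag_low = offset_bits + index_bits
    return {
        "unchanged": (0, offset_bits - 1),          # Vi = Ri (page offset)
        "tlb_index": (index_low, tag_low - 1),      # selects the TLB set
        "tlb_tag": (tag_low, va_bits - 1),          # compared on lookup
        "real_from_tlb": (offset_bits, ra_bits - 1) # supplied by the entry
    }


for name, (low, high) in tlb_fields().items():
    print(f"{name}: bits {low}..{high}")
```

Under these assumptions the page offset occupies the lowest 9 bits, the next 7 bits select a TLB set, and the remaining virtual bits form the tag. A different reading of "128 × 2" (for example 128 entries in total, giving 64 sets) changes the index width, so the set count is left as a parameter.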
|
|
|
|
|
|
|
|
12. The translated address is now used to access a unified 32 KB cache (CBWA) with 16-byte lines and four-way set associativity (L/S, user only, no system effects, LRU, no I/O). |
|
|
|
 |
|
|
|
|
(a) What is the expected miss rate? |
|
|
|
 |
|
|
|
|
(b) Suppose the cache described earlier was determined to have a (read) miss rate of 2% and an effective miss time of 5 cycles. Assume an L/S processor (scientific environment) with |
|
|
|
 |
|
|
|
|
as the basic timing templates and 4B fetched per IF. An instruction is decoded each cycle. Assume neither delayed branch nor bypassing and that the CC is set at the end of the last EX cycle. Use BR = 0.05 and BC = 0.15; always guess in-line on BC (BC actually goes to target 50% of the time). The processor creates how many I references/instruction? |
|
|
|
 |
|
|
|
|
For the same processor, assume the data referencing causes 0.33 D-read references per instruction and 0.235 D-write references per instruction. How many cycles per instruction do cache misses add to the processor execution time? |
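The arithmetic behind the last question is a product of references per instruction, miss rate, and effective miss time. The sketch below shows that product in Python; it is a minimal illustration, not the textbook's solution. The function name `excess_cpi` is hypothetical, the instruction-reference rate is left as a parameter because it depends on the timing templates and branch behavior from the previous question, and applying the stated 2% read miss rate to writes as well is an assumption about how a CBWA cache is being modeled.

```python
MISS_RATE = 0.02      # read miss rate given in part (b)
MISS_CYCLES = 5       # effective miss time in cycles, given in part (b)
D_READS = 0.33        # D-read references per instruction (given)
D_WRITES = 0.235      # D-write references per instruction (given)


def excess_cpi(refs_per_instr, miss_rate=MISS_RATE, miss_cycles=MISS_CYCLES):
    """Cycles per instruction added by misses for a given reference rate."""
    return refs_per_instr * miss_rate * miss_cycles


def total_excess_cpi(i_refs, count_writes=True):
    """Sum the contributions of I references, D reads and (optionally) D writes.

    i_refs must come from the answer to the preceding question.
    count_writes=True assumes write misses cost the same as read misses.
    """
    refs = i_refs + D_READS + (D_WRITES if count_writes else 0.0)
    return excess_cpi(refs)


# Contribution of data reads alone: 0.33 * 0.02 * 5
print(round(excess_cpi(D_READS), 4))
```

Calling `total_excess_cpi(i_refs)` with the instruction-reference rate found earlier gives the combined figure; passing `count_writes=False` shows how much of it depends on the write-miss assumption.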
|
|
|
|
|
|
|
|
13. Suppose the microprocessor on-chip area is sufficient for only a 12 KB (32B line) integrated cache. |
|
|
|
 |
|
|
|
|
(a) Show how a 26-bit address would be partitioned in tag, index, and line address using a 3-way set associative cache. Show directory entry. |
|
|
|
 |
|
|
|
|
(b) What is the effective miss rate for (a)? |
|
|
|
 |
|
|
|
|
(c) Suppose we now want to create a 12 KB (32B line) direct-mapped cache. Show address partitioning and directory entry using assumptions of (a). |
|
|
|
 |
|
|
|
|
(d) Find the miss rate for (c). |
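The address partitioning asked for in part (a) follows from the cache size, line size, and associativity. The Python sketch below computes the field widths under the assumption that the set count must be a power of two so the index is a plain bit field; the function name `cache_fields` and its return keys are hypothetical. It does not attempt the miss rates in (b) and (d), which depend on design-target data not reproduced in this section.

```python
def cache_fields(size_bytes, line_bytes, ways, addr_bits):
    """Split an address into tag / index / line-offset widths.

    Raises ValueError when the line size or set count is not a power of
    two, since a simple bit-field index is then impossible.
    """
    sets = size_bytes // (line_bytes * ways)
    if sets * line_bytes * ways != size_bytes:
        raise ValueError("cache size is not a whole number of sets")
    for name, value in (("line size", line_bytes), ("set count", sets)):
        if value <= 0 or value & (value - 1):
            raise ValueError(f"{name} {value} is not a power of two")

    offset_bits = line_bytes.bit_length() - 1
    index_bits = sets.bit_length() - 1
    return {
        "sets": sets,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "tag_bits": addr_bits - offset_bits - index_bits,
    }


# Part (a): 12 KB, 32-byte lines, 3-way set associative, 26-bit address.
print(cache_fields(12 * 1024, 32, 3, 26))

# Part (c): the same capacity direct mapped gives 384 lines, which is not
# a power of two, so this simple partitioning does not apply directly.
try:
    cache_fields(12 * 1024, 32, 1, 26)
except ValueError as err:
    print("direct-mapped case:", err)
```

For part (a) the sketch yields 128 sets, so a 26-bit address splits into a 5-bit line offset, a 7-bit index, and a 14-bit tag. The same function applied to the cache in problem 12 (32 KB, 16-byte lines, four-way, 32-bit real address) gives 512 sets. The direct-mapped case in part (c) is deliberately left as an error here because 384 lines cannot be indexed by a plain bit field, and the intended treatment is not given in this section.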
|
|
|
|
|