Study 8.1 SRMP vs. Pipelined Processor

In this study, we contrast a conventional pipelined processor (similar to our baseline) with a four-processor SRMP occupying roughly the same chip area.

Suppose an L/S pipelined processor has a 16KB I-cache and an 8KB D-cache, both set associative with CBWA and LRU replacement. The caches have a 16B line and a miss delay of eight cycles. The processor makes one I-refr/I and 0.5 D-refr/I. Without cache misses, the processor itself achieves 1.5 CPI (i.e., one CPI for decode and 0.5 CPI for branch, run-on, and other effects).

We contrast the pipelined processor with a four-processor SRMP. Each processor has its own register set and I-cache (4KB direct mapped). The SRMP shares the D-cache, decoder, floating point ALU, etc. Once a processor stalls (cache miss, etc.), the SRMP switches on the next cycle to the next available processor. The SRMP D-cache is designed to be non-blocking on a miss; i.e., the miss is processed concurrently with accesses from another processor (unless, of course, the access is to the missed line).
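
The switch-on-stall rule described above can be illustrated with a small sketch. It assumes the "next available processor" is chosen in round-robin order, which the text does not state; the function name `next_available` and the `stalled` list are hypothetical names used only for illustration.

```python
from typing import Optional, Sequence


def next_available(current: int, stalled: Sequence[bool]) -> Optional[int]:
    """Return the index of the next non-stalled processor after `current`.

    Round-robin order is an assumption of this sketch; the text only says
    that the SRMP switches to "the next available processor".
    Returns None when every processor is stalled.
    """
    n = len(stalled)
    for step in range(1, n + 1):
        candidate = (current + step) % n
        if not stalled[candidate]:
            return candidate
    return None


# Processor 0 has just stalled on a miss; processor 1 is still waiting on
# an earlier miss, so the switch skips it.
stalled = [True, True, False, False]
chosen = next_available(0, stalled)
print(chosen)
```

When all four processors are stalled the function returns `None`, which corresponds to a cycle in which the shared resources sit idle.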

Pipelined Processor Analysis

The additional CPI lost due to cache misses (using chapter 4 data) is computed as follows:

$$
\begin{aligned}
\text{I-cache CPI loss} &= \text{I-cache miss rate} \times \text{I-refr/I} \times \text{miss penalty} \\
&= [0.05 \times 1.04] \times 1 \times 8 \text{ cycles} \\
&\approx 0.42 \text{ CPI}
\end{aligned}
$$

$$
\begin{aligned}
\text{D-cache CPI loss} &= \text{D-cache miss rate} \times \text{D-refr/I} \times \text{miss penalty} \\
&= [0.08 \times 1.04] \times 0.5 \times 8 \text{ cycles} \\
&\approx 0.33 \text{ CPI}
\end{aligned}
$$

Pipelined processor CPI total = 1.5 + 0.42 + 0.33 ≈ 2.25.
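
The arithmetic for the pipelined processor can be checked with a short script. This is a minimal sketch: the helper name `miss_cpi` and the parameter name `adjust` are illustrative and not from the text, and the 1.04 factor is taken as given from the bracketed terms without interpreting what it represents.

```python
def miss_cpi(miss_rate, adjust, refs_per_instr, penalty_cycles):
    """CPI lost to misses: (miss rate * adjustment) * refs/instr * penalty."""
    return miss_rate * adjust * refs_per_instr * penalty_cycles


BASE_CPI = 1.5     # CPI without cache misses (from the text)
MISS_PENALTY = 8   # miss delay in cycles (from the text)

# The 1.04 factors are copied from the bracketed terms in the text.
i_loss = miss_cpi(0.05, 1.04, 1.0, MISS_PENALTY)   # one I-refr/I
d_loss = miss_cpi(0.08, 1.04, 0.5, MISS_PENALTY)   # 0.5 D-refr/I
total = BASE_CPI + i_loss + d_loss

print(f"I-cache loss: {i_loss:.2f} CPI")
print(f"D-cache loss: {d_loss:.2f} CPI")
print(f"Total:        {total:.2f} CPI")
```

Rounded to two places, the total agrees with the 2.25 CPI stated above.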

In the SRMP, each processor has its own I-cache (4KB, direct mapped), and the processors share the D-cache. This ensures cache consistency and simplifies the I-cache design.

$$
\begin{aligned}
\text{I-cache CPI loss} &= [0.095 \times 1.29] \times 1 \times 8 \text{ cycles} \\
&\approx 0.98 \text{ CPI}
\end{aligned}
$$

The D-cache has data for four processors resident. We approximate this situation by using MP = 3 (warm start) and Q = 100.

$$
\begin{aligned}
\text{D-cache CPI loss} &= [0.26 \times 1.04] \times 0.5 \times 8 \text{ cycles} \\
&\approx 1.08 \text{ CPI}
\end{aligned}
$$

Total CPI for single SRMP processor = 1.5 + 0.98 + 1.08 ≈ 3.56 CPI.
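
The single-processor SRMP figure can be reproduced in the same way. This sketch assumes each SRMP processor keeps the 1.5 CPI base of the pipelined processor, which is inferred from the stated total rather than spelled out in the text; `processor_cpi` and `srmp_terms` are hypothetical names.

```python
def processor_cpi(base_cpi, penalty_cycles, cache_terms):
    """Base CPI plus miss loss summed over (miss_rate, adjust, refs_per_instr)."""
    loss = sum(rate * adjust * refs * penalty_cycles
               for rate, adjust, refs in cache_terms)
    return base_cpi + loss


# Values copied from the SRMP expressions in the text.
srmp_terms = [
    (0.095, 1.29, 1.0),  # private 4KB direct-mapped I-cache, one I-refr/I
    (0.26, 1.04, 0.5),   # shared D-cache (MP = 3, Q = 100), 0.5 D-refr/I
]

# Base CPI of 1.5 is an assumption inferred from the stated 3.56 total.
srmp_cpi = processor_cpi(1.5, 8, srmp_terms)
print(f"Single SRMP processor: {srmp_cpi:.2f} CPI")
```

The result rounds to the 3.56 CPI given above, with the I-cache and D-cache terms contributing roughly 0.98 and 1.08 CPI respectively.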