1 FP multiply and 1 FP addition.

Multiprocessor specifications:

For a copyback cache, assume a dirty-line ratio of 50%.

For the multiprocessor shared data cache, use multiprogrammed environment = 2, Q = 100 as an approximation. The only adjustment required is for set size.

A secondary cache implementation increases system cost by 20%.

An additional branch adder reduces the specified CPI by 0.08 for the baseline processor, as computed in study 4.11.

The processor's intended environment is scientific.

The physical address space is 64 MB.

Each cache tag includes an additional 4-bit control field (for cache coherency and line replacement purposes).

Because of pin limitations, there are two memory interface designs to choose from for all possible implementations: either 32b or 64b physical words can be used, with the 64b implementation adding 5% to system cost.
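
The storage implied by the addressing and tag assumptions above can be worked out with a little arithmetic. The sketch below is only illustrative. It assumes a physically addressed, set-associative cache with power-of-two sizes, and a copyback policy in which every miss replaces one line. The cache size, line size, and associativity in the usage lines are hypothetical values, not part of the specification. Only the 64 MB address space, the 4-bit control field, and the 50% dirty-line ratio come from the list above.

```python
from math import log2

PHYS_ADDR_BITS = 26   # 64 MB physical address space = 2**26 bytes
CONTROL_BITS = 4      # per-tag control field (coherency, line replacement)
DIRTY_RATIO = 0.5     # copyback assumption: half of replaced lines are dirty


def tag_entry_bits(cache_bytes, line_bytes, ways):
    """Bits per tag entry (address tag plus control field).

    Assumes a physically addressed cache whose size, line size, and
    associativity are all powers of two (an assumption of this sketch).
    """
    sets = cache_bytes // (line_bytes * ways)
    offset_bits = int(log2(line_bytes))   # byte-within-line bits
    index_bits = int(log2(sets))          # set-selection bits
    return PHYS_ADDR_BITS - index_bits - offset_bits + CONTROL_BITS


def tag_array_bits(cache_bytes, line_bytes, ways):
    """Total tag storage in bits: one tag entry per cache line."""
    lines = cache_bytes // line_bytes
    return lines * tag_entry_bits(cache_bytes, line_bytes, ways)


def copyback_lines_per_miss(dirty_ratio=DIRTY_RATIO):
    """Expected lines transferred per miss: one line fetched, plus one
    line written back whenever the replaced line is dirty."""
    return 1.0 + dirty_ratio


# Hypothetical example: 8 KB direct-mapped cache with 16-byte lines.
print(tag_entry_bits(8 * 1024, 16, 1))
print(tag_array_bits(8 * 1024, 16, 1))
print(copyback_lines_per_miss())
```

Raising the associativity at a fixed cache size shrinks the index field and therefore widens each tag entry by one bit per doubling of the set size, which is one way set size feeds back into implementation cost.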

Let us take a moment to develop the analytical tools required to arrive at a reasonable conclusion. In this section, we lay out the design choices to be considered and the analytical models to be used. In the next section, we present the data derived from this analysis for each of the possible implementations.

Given the above specification and assumptions, the next step is to define what we have to work with and what choices need to be made. Since the task is to optimize performance, we start with a brief review of CPU performance.
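
As a reminder of the standard relation such a review rests on, execution time is the product of instruction count, cycles per instruction (CPI), and cycle time. The sketch below applies that relation to the branch-adder option from the specification. The baseline CPI of 1.5 in the usage line is a hypothetical placeholder, and the speedup assumes instruction count and cycle time are unchanged by the extra adder. Only the 0.08 CPI reduction comes from the text.

```python
BRANCH_ADDER_CPI_GAIN = 0.08   # CPI reduction quoted in the specification


def execution_time(instructions, cpi, cycle_time_ns):
    """Execution time in ns = instruction count x CPI x cycle time."""
    return instructions * cpi * cycle_time_ns


def branch_adder_speedup(base_cpi, gain=BRANCH_ADDER_CPI_GAIN):
    """Speedup from the extra branch adder, assuming instruction count
    and cycle time stay the same (an assumption of this sketch)."""
    return base_cpi / (base_cpi - gain)


# Hypothetical baseline CPI of 1.5; not a value from the specification.
print(branch_adder_speedup(1.5))
```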