By the same type of argument that we used earlier in this chapter, we can determine the optimal bypass factor (gopt) as: |
|
|
|
|
|
|
|
|
Since n is generally small, in principle, even a rather small bypass buffer ought to provide effective conflict-free referencing of data caches. It should be emphasized, however, that conflict-free does not necessarily mean dependence-free; thus, even in situations where the cache can provide all the required bandwidth for the processor, delays in the accessing path may create dependencies that necessarily affect processor performance. These dependencies are not predicted or measured by our models. |
|
|
|
|
|
|
|
|
The n requests may come from both buffered and unbuffered sources, as with buffered writes and unbuffered reads. In this case, |
|
|
|
 |
|
|
|
|
n = nr + nw, |
|
|
|
|
|
|
|
|
that is, n consists of nr read requests and nw write requests per memory service time. So long as d is small (at least with respect to either component of n), we can treat the write buffer as linearly separable from the overall problem of optimal buffering. That is, we can find |
|
|
|
|
|
|
|
|
where dw is nw per write source, that is, nw divided by the number of write sources. |
|
|
|
|
|
|
|
|
Then we can use this gwopt in our bandwidth estimate B(m, n, gwopt, d). This assumes that the writes are perfectly buffered, so that no dependencies arise. |
|
|
|
|
|
|
|
|
Suppose we have a superscalar processor with a 4-way interleaved data cache. The processor has four LD/ST units, and each makes 0.3 reads per cycle and 0.2 writes per cycle. The cache has a unit cycle time. The writes are fully buffered; the reads are not. Find the expected performance. |
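As a first step toward this example, the model inputs can be tabulated from the problem statement. The following is a minimal sketch, not the text's own procedure: the names `RequestRates` and `split_request_rates` are hypothetical, and it assumes that each LD/ST unit counts as one write source and that the memory service time equals the unit cache cycle time. It computes only nr, nw, n, and dw; the bandwidth estimate B itself is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class RequestRates:
    n_r: float  # read requests per memory service time
    n_w: float  # write requests per memory service time
    n: float    # total requests per memory service time (n = nr + nw)
    d_w: float  # write requests per write source (dw = nw / write sources)


def split_request_rates(units, reads_per_cycle, writes_per_cycle,
                        service_time_cycles=1.0, write_sources=None):
    """Split the offered request rate into read and write components.

    Assumption: unless stated otherwise, every LD/ST unit is one write source.
    """
    if write_sources is None:
        write_sources = units
    n_r = units * reads_per_cycle * service_time_cycles
    n_w = units * writes_per_cycle * service_time_cycles
    return RequestRates(n_r=n_r, n_w=n_w, n=n_r + n_w, d_w=n_w / write_sources)


# Values taken from the example: four LD/ST units, 0.3 reads and 0.2 writes
# per unit per cycle, unit cache cycle time, 4-way interleaved cache (m = 4).
m = 4
rates = split_request_rates(units=4, reads_per_cycle=0.3, writes_per_cycle=0.2)

print(f"m = {m}")
print(f"nr = {rates.n_r:.2f}, nw = {rates.n_w:.2f}, n = {rates.n:.2f}")
print(f"dw = {rates.d_w:.2f}")
```

Under these assumptions the processor offers about 1.2 reads and 0.8 writes per cycle, so n is about 2 against m = 4 cache modules, with dw about 0.2 per write source.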