Figure 6.30 Resubmitted requests.
6.8.10 Nonblocking Caches
In a nonblocking cache, the processor prefetches a line in anticipation of its use. If every line can be prefetched at least one line access time before it is needed, there will be no delays and no effect on processor performance. The problem is that it is generally impossible to predict which line will be required far enough in advance to always avoid delay.
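The timing condition above can be made concrete with a toy sketch (the function and the cycle counts below are hypothetical, not from the text): the stall seen on one access is the portion of the line access time not hidden by the prefetch lead.

```python
def stall_cycles(line_access_time: int, prefetch_lead: int) -> int:
    """Cycles the processor waits on one access: the part of the line
    access time not hidden by issuing the prefetch early.
    (Illustrative model only; both parameters are hypothetical.)"""
    return max(0, line_access_time - prefetch_lead)

# A lead equal to the full access time hides the miss completely,
# while a late prefetch exposes the remainder of the access.
print(stall_cycles(line_access_time=20, prefetch_lead=20))  # -> 0
print(stall_cycles(line_access_time=20, prefetch_lead=5))   # -> 15
```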
There are two approaches to modeling the effects of nonblocking caches on memory contention. Both give broadly similar results, and (unfortunately) both are probably optimistic: neither recognizes the dependencies that arise when a cache line cannot be prefetched far enough in advance of its use to overlap the line access time.
The first approach is a direct extension of our partially blocked model; the memory busy time now becomes

T_busy = T_line access (1 + w),

and the partially blocked analysis proceeds as before with this larger busy time.
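As a quick numeric sketch (the values are hypothetical), a 4-cycle line access time with a waiting factor w = 0.25 gives an effective busy time of 5 cycles:

```python
def t_busy(t_line_access: float, w: float) -> float:
    """Effective memory busy time in the extended partially blocked
    model: the line access time stretched by the waiting factor w."""
    return t_line_access * (1 + w)

print(t_busy(t_line_access=4.0, w=0.25))  # -> 5.0
```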
The second approach is more sophisticated and uses an M/M/1 finite population model.
Here we make a number of simplifying assumptions. We imagine that the program consists of two essentially independent processes: when we fetch a line for one process, we switch to the other process and, ideally, the first line is available before the second process makes its own line access. Because there are only two tasks in this system, we must modify our Poisson arrival distribution into a truncated Poisson distribution; we describe this more fully when treating multiprogrammed I/O systems in chapter 9. A (modified) Poisson arrival model is reasonable here, since the probability of an access in any given processor cycle is low. We assume further that the memory service time is not constant but rather exponentially distributed (squared coefficient of variation c² = 1). This is conservative, but it recognizes that the memory busy time has a variance arising from the fact that some lines are clean and some are dirty. Using the results of the finite population queueing models in chapter 9, we can represent the achieved occupancy as:
ρ_a = 1 − p_0 = 2r(1 + r) / (1 + 2r + 2r²),
where r = t_s/t_p, t_p being the mean time between cache misses and t_s the mean time to process a cache miss. The above assumes that t_p is greater than t_s (i.e., r < 1).
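The occupancy expression can be checked numerically. In the sketch below, the function name and the numeric values are mine, and the state probabilities are taken from the standard two-customer machine-repairman balance equations (p_1 = 2r·p_0, p_2 = 2r²·p_0), so the achieved occupancy is computed as 1 − p_0:

```python
def achieved_occupancy(ts: float, tp: float) -> float:
    """Achieved memory occupancy for two alternating tasks
    (M/M/1 finite population model, population 2): exponential
    service time ts, mean time tp between misses per task.
    Birth-death balance: p1 = 2*r*p0, p2 = 2*r**2*p0, r = ts/tp."""
    r = ts / tp
    p0 = 1.0 / (1.0 + 2.0 * r + 2.0 * r * r)  # normalization constant
    return 1.0 - p0  # memory is busy whenever a miss is outstanding

# Illustrative values: a miss every 4 time units, each served in 2 (r = 0.5).
print(achieved_occupancy(ts=2.0, tp=4.0))  # -> 0.6
```

Note that as r grows toward 1, the occupancy approaches but never reaches saturation, consistent with the requirement that t_p exceed t_s for the two-task overlap to hide most of the miss latency.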