< previous page page_396 next page >

binomial model. If a particular requestor has a vanishingly small probability of making a request during any given service interval (as in the case of, say, I/O), we use the Poisson arrival model.
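The relationship between the two arrival models can be illustrated numerically: with many requestors, each with a small request probability, the binomial distribution converges to the Poisson distribution. The following sketch (with hypothetical values for the number of requestors and the per-requestor probability) shows this convergence.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k requests from n requestors,
    each requesting independently with probability p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Probability of exactly k arrivals when the mean number of
    arrivals per service interval is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Hypothetical values: 1000 requestors, each with a 0.2% chance of
# requesting in a given interval.  Holding n*p fixed while n grows,
# the binomial arrival model approaches the Poisson arrival model.
n, p = 1000, 0.002
lam = n * p  # mean arrivals per service interval
for k in range(4):
    print(k, round(binomial_pmf(k, n, p), 4), round(poisson_pmf(k, lam), 4))
```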
The second dimension is the variance of the service distribution, measured by the squared coefficient of variation (c²). In cases where the service time is fixed and does not vary, we use the constant service distribution (D), for which c² equals zero. Frequently the service time is known to vary but the variance is unknown; for ease of analysis one often chooses c² = 1, which corresponds to an exponential service time. Of course, if the actual variance is known, c² can be computed and we can use the M/G/1 queueing model or a variation of it.
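The effect of c² on queueing delay can be made concrete with the standard Pollaczek-Khinchine result for the mean waiting time in an M/G/1 queue, W = ρ·Ts·(1 + c²)/(2·(1 − ρ)), where ρ is the server occupancy and Ts the mean service time. The sketch below uses hypothetical values of ρ and Ts and shows that a constant service time (c² = 0) halves the mean wait relative to an exponential one (c² = 1).

```python
def mg1_wait(rho, ts, c2):
    """Mean waiting time in queue for an M/G/1 server
    (Pollaczek-Khinchine): W = rho*Ts*(1 + c2) / (2*(1 - rho))."""
    assert 0 <= rho < 1, "occupancy must be below saturation"
    return rho * ts * (1 + c2) / (2 * (1 - rho))

ts = 1.0   # hypothetical mean service time
rho = 0.5  # hypothetical server occupancy
print(mg1_wait(rho, ts, 0.0))  # constant service (D), c2 = 0 -> 0.5
print(mg1_wait(rho, ts, 1.0))  # exponential service (M), c2 = 1 -> 1.0
```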
The third parameter determining the simple queueing model is the amount of buffering available to the requestor to hold pending requests, allowing the requestor to continue issuing requests while earlier pending requests are delayed. This bypassing, or buffering, factor is dealt with when we take up more complex processor models in chapter 7 and when we deal with complex I/O storage arrangements in chapter 9.
6.8 Processors with Cache
Almost all modern processors use caches to access both instructions and data. The only notable exceptions are the vector processors, which access data directly from memory; we discuss vector processors further in the next chapter.
Conventional processors that access all of their instructions and data from cache require some discussion. Generally, processors that access noninterleaved caches should suffer little or no performance degradation due to memory system contention. This does not mean that memory system delay does not affect performance; indeed, the simpler the cache and the longer the cache miss penalty, the poorer the overall performance. In these cases, however, the performance is quite predictable: the processor stops and waits until the memory system returns an entire line, then resumes. Requests do not contend for memory, so only the predictable memory access delay remains. As discussed in chapter 5, there are three types of cache memory interactions:
1. Fully blocked. When the cache misses, the processor completely stops processing until the entire line is returned to the cache, then processing resumes.
2. Partially blocked. The processor resumes processing after some portion of the line is returned; thus, there is a period when both processor and memory are busy.
3. Nonblocked. A nonblocking cache allows multiple pending misses. Thus, the processor does not stop when a miss occurs unless required by a data dependency. Nonblocking caches that allow up to d misses before blocking are called nonblocking caches of degree d [267]. Prefetching of lines is associated with nonblocking caches, since it allows lines to be fetched in anticipation of use. Some processors support prefetching with special instructions, e.g., touch.
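The performance consequence of these three interaction styles can be sketched with a simple CPI estimate. The model and all numbers below are illustrative assumptions, not figures from the text: the processor is assumed to stall for some fraction of each miss penalty, with a fully blocked cache stalling for all of it and the partially blocked and nonblocked cases overlapping part of the penalty with execution.

```python
def cpi_with_cache(cpi_base, miss_rate, miss_penalty, stall_fraction=1.0):
    """Rough CPI estimate: each miss stalls the processor for
    stall_fraction of the miss penalty.
      fully blocked:     stall_fraction = 1.0 (wait for the whole line)
      partially blocked: stall_fraction < 1.0 (resume once the needed
                         portion of the line arrives)
      nonblocked:        stall_fraction lower still, limited by data
                         dependencies and the degree d of pending misses"""
    return cpi_base + miss_rate * miss_penalty * stall_fraction

# Hypothetical values: base CPI 1.0, 5% miss rate, 20-cycle miss penalty.
base, mr, mp = 1.0, 0.05, 20
print(cpi_with_cache(base, mr, mp, 1.0))  # fully blocked     -> 2.0
print(cpi_with_cache(base, mr, mp, 0.4))  # partial overlap (assumed) -> 1.4
```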