EXAMPLE 6.11
Consider the Intel Pentium™ processor. This processor has an 8-way interleaved data cache and can make two references per processor cycle. The caches have the same cycle time as the processor. For the Intel instruction set,
Prob (data references per instruction) = 0.6.
Note that the instruction issue mechanism is the source here (not some pipeline stage, as discussed earlier). This mechanism is modeled as having two independent requestors, one for each of the instructions that can be issued, so
Prob (data reference per instruction) = d
and therefore
d = 0.6.
Since the Pentium tries to execute two instructions each cycle, we have
d = 0.6,  n = 1.2,  m = 8.

Using the d-binomial model, we get:
B(m,n,d) = B(8,1.2,0.6).
The relative performance, the achieved bandwidth B(m,n,d) divided by the offered request rate n, is approximately 0.96; i.e., the processor slows down by about 4% due to contention.
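The arithmetic in this example can be checked with a short script. The sketch below is illustrative only: it assumes the closed form B(m,n,d) = m + n - d/2 - sqrt((m + n - d/2)^2 - 2mn) for the d-binomial model, which is not reproduced on this page, and the names `delta_binomial_bandwidth`, `issue_width`, and `relative_performance` are hypothetical.

```python
import math


def delta_binomial_bandwidth(m, n, d):
    """Achieved bandwidth B(m, n, d) in references per cycle.

    ASSUMPTION: the closed form below is taken to be the d-binomial
    model; it is not stated on this page of the text.
    """
    a = m + n - d / 2.0
    return a - math.sqrt(a * a - 2.0 * m * n)


d = 0.6              # Prob(data reference per instruction), from the example
issue_width = 2      # instructions the Pentium tries to execute per cycle
m = 8                # interleaved data cache modules
n = issue_width * d  # offered request rate: 1.2 references per cycle

b = delta_binomial_bandwidth(m, n, d)
relative_performance = b / n  # achieved bandwidth over offered bandwidth

print(f"B = {b:.3f}, relative performance = {relative_performance:.3f}")
```

Under that assumed formula the script reports a relative performance of roughly 0.96, consistent with the 4% slowdown quoted above.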
6.9 Conclusions
The primary objective of modern memory systems design is capacity (or size) at a low per-bit cost, but large memory capacity necessarily implies slow access time. Even if chip access is fast, the system's overhead, including bus signal transmission, error checking, and address distribution, adds significant delay; and for most available technology, these overhead delays are likely to increase relative to decreasing machine cycle times.
Faced with a multiple-cycle memory access time, the designer can at least provide adequate memory bandwidth to match or exceed the offered bandwidth (request rate) from the processor. There are two ways of accomplishing this:

 