
Page 555
Figure 8.26. Typical processor-memory cluster.
The processor is slowed down due to this contention. There are (1.5 × 1.3 × 5) = 9.75 transactions per 100 instructions, so there are 100/9.75 = 10.26 instructions between transactions, and each transaction takes 300 nsec.
So, the processor's effective execution rate is slowed down by:
[Equation not recovered from the source image.]
and the effective processor execution rate is:
0.93 × 7.73 MIPS = 7.19 MIPS.
Note that in this case, the bus or the memory as defined would equally limit processor performance.
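The arithmetic of the worked example above can be checked with a short sketch. It uses only the figures quoted in the text; the three transaction factors and the 0.93 slowdown are taken as given rather than derived here, and the function and constant names are illustrative.

```python
# Figures quoted in the worked example (names are illustrative).
TRANSACTION_FACTORS = (1.5, 1.3, 5)  # product = transactions per 100 instructions
PEAK_RATE_MIPS = 7.73                # processor rate before contention
SLOWDOWN_FACTOR = 0.93               # taken from the text, not derived here


def transactions_per_100_instructions(factors=TRANSACTION_FACTORS):
    """Multiply the per-instruction factors to get transactions per 100 instructions."""
    product = 1.0
    for factor in factors:
        product *= factor
    return product


def instructions_between_transactions(factors=TRANSACTION_FACTORS):
    """Average number of instructions executed between two transactions."""
    return 100.0 / transactions_per_100_instructions(factors)


def effective_rate_mips(peak=PEAK_RATE_MIPS, slowdown=SLOWDOWN_FACTOR):
    """Execution rate after applying the contention slowdown factor."""
    return slowdown * peak


if __name__ == "__main__":
    print(f"{transactions_per_100_instructions():.2f} transactions per 100 instructions")
    print(f"{instructions_between_transactions():.2f} instructions between transactions")
    print(f"{effective_rate_mips():.2f} MIPS effective rate")
```

Changing any one factor shows how sensitive the effective rate is to the transaction frequency, provided the slowdown factor is recomputed from the contention model rather than held at 0.93.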
8.10 Scalable Multiprocessors
Shared-bus multiprocessor configurations are limited by bus bandwidth. At some point, the traffic that the processors generate to maintain a coherent picture of the memory state saturates the bus, and adding further processors to such a system provides no net improvement in performance. Extending performance beyond this point requires a high-bandwidth network and some mechanism to ensure memory coherency across the network. Suppose we consider the shared-bus multiprocessor as a processor cluster (Figure 8.26).
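The saturation effect described above can be illustrated with a deliberately simplified sketch. It assumes a hard-limit model: each transaction occupies the bus for a fixed time, processors run at full speed until total demand reaches the bus capacity, and queueing delay and extra coherence traffic are ignored. Treating the 300 nsec transaction time from the earlier example as bus occupancy is an assumption made only for illustration, and all names are hypothetical.

```python
def bus_utilization_per_processor(rate_mips, transactions_per_instruction,
                                  transaction_time_ns):
    """Fraction of bus time one processor would demand if it never waited."""
    transactions_per_second = rate_mips * 1e6 * transactions_per_instruction
    return transactions_per_second * transaction_time_ns * 1e-9


def aggregate_rate_mips(n_processors, rate_mips, transactions_per_instruction,
                        transaction_time_ns):
    """Total MIPS of n processors under a hard bus-capacity limit (assumed model)."""
    demand = n_processors * bus_utilization_per_processor(
        rate_mips, transactions_per_instruction, transaction_time_ns)
    if demand <= 1.0:
        # Bus is not saturated: throughput scales with processor count.
        return n_processors * rate_mips
    # Bus is saturated: it completes at most one transaction per transaction time,
    # which caps the number of instructions the whole system can execute.
    max_transactions_per_second = 1e9 / transaction_time_ns
    return max_transactions_per_second / transactions_per_instruction / 1e6


if __name__ == "__main__":
    # Illustrative inputs: 7.73 MIPS, 9.75 transactions per 100 instructions, 300 ns.
    for n in range(1, 9):
        total = aggregate_rate_mips(n, 7.73, 0.0975, 300.0)
        print(f"{n} processors: {total:.2f} MIPS")
```

Under these assumptions the aggregate rate grows linearly for the first few processors and then stays flat, which is the behavior the paragraph describes; a queueing model would show the curve bending earlier and more gradually.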
Figure 8.27 shows a system description of the scalable multiprocessor. Multiprocessor systems that consist of multiple clusters connected through a general interconnect network are the basis for implementing large-scale shared memory multiprocessors. If we can find a technique for facilitating a coherency protocol across an arbitrarily large number of clusters, we call that system a scalable shared memory multiprocessor. As the difficulty in

 