< previous page page_533 next page >

Most of this variety results from the way the processors share memory. For example, processors may share at one of several levels:
1. Shared data cache, shared memory.
2. Separate data caches but a shared bus and shared memory.
3. Separate data cache with separate buses leading to a shared memory.
4. Separate processors and separate memory modules interconnected with a multi-stage interconnection network.
Throughout this chapter we look at representative designs from each of these classes. Necessarily, our focus is on the simpler types of sharing arrangements.
The basic tradeoff in selecting a multiprocessor architecture is between resource limitations and synchronization/communications delay. Simple architectures are generally resource-limited but have rather low synchronization and communications overhead. More robust processor-memory configurations may offer adequate resources for extensive communications among the processors and memory, but these configurations are limited by:
1. Delay through the communications network.
2. Multiple accessing of a single synchronization variable.
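To see why repeated accesses to a single synchronization variable can dominate, consider spinning on a lock. The toy model below (a sketch with assumed, illustrative costs, not figures from the text) counts bus transactions for one lock hand-off under an invalidation-based coherency protocol: naive test-and-set puts every probe on the bus, while test-and-test-and-set lets waiters spin on a cached copy and touch the bus only around the release.

```python
def bus_traffic(n_spinners, probes_while_held, use_test_and_test_and_set):
    """Toy model of bus accesses caused by one lock hand-off.

    Assumptions (illustrative only): an invalidation-based coherency
    protocol; every atomic read-modify-write is a bus transaction;
    reading a valid cached copy costs no bus traffic.
    """
    if use_test_and_test_and_set:
        # Each waiter: one fill of the line, local spinning (free),
        # one re-read after the release invalidates the line, and
        # one atomic acquire attempt.
        return n_spinners * 3
    # Naive test-and-set: every probe is an atomic bus transaction.
    return n_spinners * probes_while_held

naive = bus_traffic(8, 1000, False)   # 8 waiters, 1000 probes each
tts   = bus_traffic(8, 1000, True)
```

Under these assumptions the naive scheme generates 8,000 bus transactions against 24 for test-and-test-and-set, which is why even modest multiprocessors avoid spinning directly on a shared synchronization variable.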
The simpler and more limited the multiprocessor configuration, the easier it is to provide synchronization, communications, and memory coherency. Each of these functions requires an access to memory. As long as memory bandwidth is adequate, these functions can be readily handled. As processor speed and the number of processors increase, shared data caches and buses eventually run out of bandwidth and become the bottleneck in the multiprocessor system. Replicating caches and/or buses to provide additional bandwidth requires management not only of the original traffic, but of the coherency traffic as well. From the system's point of view, one would expect to find an optimum level of sharing for each of the shared resources (data cache, bus, memory, etc.), fostering a hierarchical view of shared-memory multiprocessing systems [289].
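The point at which a shared bus becomes the bottleneck can be estimated with simple arithmetic. The sketch below uses assumed, illustrative parameter values (they are not taken from the text) to bound how many processors a bus of a given bandwidth can sustain, with a crude multiplier standing in for coherency traffic.

```python
def sustainable_processors(bus_bw, refs_per_sec, miss_rate, line_size,
                           coherency_factor=1.2):
    """Estimate how many processors a shared bus supports before
    saturating.  All parameter values are illustrative assumptions.

    bus_bw           -- bus bandwidth in bytes/second
    refs_per_sec     -- memory references per processor per second
    miss_rate        -- fraction of references that miss the cache
    line_size        -- bytes transferred per miss
    coherency_factor -- multiplier for invalidation/coherency traffic
    """
    bytes_per_proc = refs_per_sec * miss_rate * line_size * coherency_factor
    return int(bus_bw // bytes_per_proc)

# e.g. a 1 GB/s bus, 10^8 references/s per processor,
# 2% miss rate, 64-byte lines
n = sustainable_processors(1e9, 1e8, 0.02, 64)
```

With these numbers the bus supports only about six processors; note that raising processor speed (refs_per_sec) or the coherency overhead shrinks this bound directly, which is the bandwidth ceiling the text describes.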
8.7 Multithreaded or Shared Resource Multiprocessing
The simplest and most primitive type of multiprocessor system is what is sometimes called multithreaded, or what we call here shared resource multiprocessing (SRMP) [5]. In SRMP, each processor consists of basically only a register set (program counter, general registers, instruction counter, etc.). The driving principle behind SRMP is to make the best use of processor silicon area. Area-intensive but perhaps infrequently used