Figure 7.14
Major data paths in a generic vector processor.
7.3.1 The Special Case of Vector Memory
Simple low-order interleaving improves memory bandwidth for all reference patterns except the general vector category (see chapter 6), whose references are nonsequential but systematic. Such access patterns are typical of scientific applications involving matrix computations. When an array is stored so that access by row is sequential and creates no interference, access by column can be disastrous if the array dimension is the same as the interleaving factor, because successive column elements then fall in the same memory module. Since the purpose of large scientific processors is to move vectors rapidly from memory into processing position, conventionally interleaved memory structures are inadequate.
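
The row-versus-column contrast can be illustrated with a minimal sketch. It assumes the usual low-order mapping (module index = word address mod number of modules) and an array stored row by row; the function names, the 8-module memory, and the 8-word row length are illustrative choices, not values taken from the text.

```python
def module_of(address, num_modules):
    """Low-order interleaving: the module index is the address modulo the module count."""
    return address % num_modules


def modules_touched(base, stride, length, num_modules):
    """Return the module index hit by each of `length` references spaced `stride` words apart."""
    return [module_of(base + i * stride, num_modules) for i in range(length)]


NUM_MODULES = 8  # assumed interleaving factor
ROW_LENGTH = 8   # assumed array dimension, equal to the interleaving factor

# Row access: consecutive words, stride 1.
row_access = modules_touched(0, 1, 8, NUM_MODULES)

# Column access: adjacent column elements are one full row apart.
column_access = modules_touched(0, ROW_LENGTH, 8, NUM_MODULES)

print(row_access)     # [0, 1, 2, 3, 4, 5, 6, 7]
print(column_access)  # [0, 0, 0, 0, 0, 0, 0, 0]
```

Under these assumptions the row access spreads over all eight modules, while every column reference queues on module 0, so the column access gains nothing from interleaving.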
In accessing a vector in memory, the address distance (in physical words) between adjacent elements is called the stride of the access pattern. It is common for these strides to be of the form 2^k, 10^k, or other even dimensions. For such applications it is very useful to consider designs that remap addresses (discussed later) and designs that use a prime number of memory modules (or at least a number that is relatively prime to most expected strides). The difficulty here is in translating mod 2^n addresses into addresses for such a memory system.
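
The value of a module count that is relatively prime to the stride can be seen in a short sketch. It again assumes module index = address mod number of modules, and relies on the standard number-theoretic fact that a constant-stride sequence visits m / gcd(stride, m) distinct modules out of m. The function name and the sample strides are illustrative assumptions.

```python
from math import gcd


def distinct_modules(stride, num_modules):
    """Number of distinct modules visited by a long constant-stride access."""
    return num_modules // gcd(stride, num_modules)


def distinct_modules_brute(stride, num_modules):
    """Same count, obtained by enumerating one full period of module indices."""
    return len({(i * stride) % num_modules for i in range(num_modules)})


strides = [1, 2, 4, 8, 10, 100]

# 8 modules (a power of two) versus 7 modules (prime).
for m in (8, 7):
    counts = [distinct_modules(s, m) for s in strides]
    assert counts == [distinct_modules_brute(s, m) for s in strides]
    print(m, counts)
# 8 [8, 4, 2, 1, 4, 2]
# 7 [7, 7, 7, 7, 7, 7]
```

With 7 modules every stride in this sample uses all of the modules; only strides that are multiples of 7 would collapse onto a single module. The cost, as noted above, is that the module index can no longer be read directly from the low-order address bits.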
Interleaving Using 2^k ± 1 Modules
Certain numbers have special properties, however, especially numbers of the form 2^k ± 1 [67]. First, consider interleaving by 2^k + 1 modules. It is