Figure 7.24
The $n_{1/2}$ factor. If the vector length $n \gg s$, then $S_p \approx S_{p,\max}$; but if $n \approx s$, then $S_p \approx S_{p,\max}/2$. This occurs when the vector length $n = n_{1/2}$.
Figure 7.25
Relative vector performance vs. relative vector length (after Hockney and Jesshope [132]).
chaining. If $c$ functional units can be chained, the maximum performance is simply $c \times R_\infty$ (usually, $c = 2$).
Assume $n$ is the vector size (the number of elements in a vector register). The parameter $n_{1/2}$ measures the depth of the pipeline, or the vector startup cost (in cycles). It is the vector operand length that achieves exactly one-half of the maximum performance. Usually, a vector processor cannot start a new vector instruction until the previous vector instruction has finished using the vector pipeline. This delay before a new instruction can begin corresponds to the startup time of the pipeline (see Figure 7.24).
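The link between startup time and $n_{1/2}$ can be illustrated with a small sketch. It assumes a deliberately simple timing model that is not taken from the text: a vector operation on `n` elements costs `startup_cycles + n` cycles, that is, one result per cycle once the pipeline has filled. The function names and the value of `s` are hypothetical.

```python
def relative_rate(n: int, startup_cycles: int) -> float:
    """Fraction of the peak result rate achieved for a vector of n elements.

    Assumed model (illustrative only): the whole operation takes
    startup_cycles + n cycles, so n results arrive in that many cycles.
    """
    if n <= 0:
        raise ValueError("vector length must be positive")
    if startup_cycles < 0:
        raise ValueError("startup cost cannot be negative")
    return n / (startup_cycles + n)


def half_performance_length(startup_cycles: int) -> int:
    """Smallest vector length whose relative rate reaches one-half of peak."""
    n = 1
    while relative_rate(n, startup_cycles) < 0.5:
        n += 1
    return n


if __name__ == "__main__":
    s = 8  # hypothetical startup cost in cycles
    print("n_half =", half_performance_length(s))
    for n in (1, 4, 8, 64, 512):
        print(f"n = {n:>3}: relative rate = {relative_rate(n, s):.3f}")
```

Under this model the half-performance length equals the startup cost, and the relative rate approaches one only when the vector is much longer than the startup time, which is the behavior the caption of Figure 7.24 describes.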
The Hockney and Jesshope vector efficiency is:
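In the form usually attributed to Hockney and Jesshope, and consistent with the definition of $n_{1/2}$ as the half-performance length, the relation can be written as below. The symbol $R$ (the rate achieved at vector length $n$) is an assumption made for this sketch; $R_\infty$ is the maximum rate.

```latex
\[
  \text{efficiency} \;=\; \frac{R}{R_\infty} \;=\; \frac{1}{1 + n_{1/2}/n}
\]
```

Setting $n = n_{1/2}$ gives an efficiency of exactly one-half, and the efficiency tends to one as $n$ grows large relative to $n_{1/2}$.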
Figure 7.25 plots the efficiency, or relative performance, against the relative vector length (the ratio $n/n_{1/2}$). For vector lengths $n$ significantly less than $n_{1/2}$, not only is the vector efficiency low, but the processor performance (the speedup over a generic pipelined processor) is probably less than one.
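To get a feel for the shape of such a curve, the following sketch tabulates efficiency against relative vector length. It assumes the usual Hockney form, efficiency = 1 / (1 + n_half / n); the function names and the sample ratios are hypothetical and are not read off the figure.

```python
def vector_efficiency(n: float, n_half: float) -> float:
    """Hockney-style efficiency R / R_inf = 1 / (1 + n_half / n) (assumed form)."""
    if n <= 0:
        raise ValueError("vector length must be positive")
    if n_half < 0:
        raise ValueError("n_half cannot be negative")
    return 1.0 / (1.0 + n_half / n)


def efficiency_table(ratios, n_half: float = 1.0):
    """Return (ratio, efficiency) pairs, where ratio = n / n_half."""
    return [(r, vector_efficiency(r * n_half, n_half)) for r in ratios]


if __name__ == "__main__":
    # Sample relative vector lengths, chosen only for illustration.
    for ratio, eff in efficiency_table([0.1, 0.25, 0.5, 1, 2, 4, 10]):
        print(f"n/n_half = {ratio:>5}: efficiency = {eff:.3f}")
```

In this form the efficiency depends only on the ratio of `n` to `n_half`: it is one-half when the two are equal, 0.8 when the vector is four times the half-performance length, and it falls off quickly for shorter vectors.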
For a simple vector processor, $n_{1/2}$ is approximately equal to the vector startup cost: the number of cycles required to produce the first result from the vector pipeline. In a vector processor with, say, a four-stage vector