
7.7 Comparing Vector and Multiple-Issue Processors
Comparing two processor types that are based upon rather different design premises certainly resembles comparing apples to oranges. If the goal of any processor design is to provide cost-effective computation across a range of applications, then insofar as we can understand the relative strengths and weaknesses of two approaches, we may be able to succeed in combining obvious strengths and/or avoiding weaknesses in the resultant design.
7.7.1 Cost Comparison
In principle, the cost of the execution units for the multiple-issue machine and the vector processor ought to be about the same, given that both are targeted at achieving the same maximum performance. Historically, vector processors have not stressed short execution pipelines, since their performance did not depend on them. The more recent multiple-issue designs recognize the importance of short execution pipelines, as well as result-per-cycle bandwidth within each execution unit, and have stressed both. In principle, however, there ought to be little difference in either the cost or the performance of the execution unit ensembles of the two processors.
A major difference lies in the storage hierarchy. Both multiple-issue machines and vector processors rely heavily on multiported registers. These registers occupy a significant amount of area. Recall from chapter 2 that the area occupied by a "dual-ported" register expressed in rbe is:
Area = (number of regs + 6) (bits per reg + 6) rbe.
We can extend this model to accommodate more than two read ports and a shared write port. Let P be the number of ports (shared or nonshared). Then the area required is approximately:
Area = (number of regs + 3P)(bits per reg + 3P) rbe.
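The port-count model above is easy to evaluate numerically. The following is a minimal sketch, not part of the original text; the function name and the 32 x 32-bit register file used in the demonstration are hypothetical choices made only for illustration.

```python
def register_file_area_rbe(num_regs: int, bits_per_reg: int, ports: int = 2) -> int:
    """Approximate register-file area in rbe for a given number of ports.

    Implements Area = (num_regs + 3P)(bits_per_reg + 3P). With P = 2 this
    reduces to the dual-ported formula with the "+ 6" terms.
    """
    if num_regs < 1 or bits_per_reg < 1 or ports < 1:
        raise ValueError("all arguments must be positive integers")
    overhead = 3 * ports  # each port adds about 3 rbe per dimension
    return (num_regs + overhead) * (bits_per_reg + overhead)


if __name__ == "__main__":
    # Hypothetical 32 x 32-bit register file (sizes are an assumption).
    for p in (2, 4, 6):
        print(f"P = {p}: {register_file_area_rbe(32, 32, p)} rbe")
```

Because the 3P term appears in both factors, the estimated area grows roughly quadratically with the number of ports once 3P is comparable to the register count or word width.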
Most vector processors have eight sets of 64 registers, each having 64 bits. Clearly, each vector register must be "dual-ported," that is, implemented with a read port and a write port. Since the registers are sequentially accessed, each port can be shared by all elements in the register set.
This allows the vector registers to be serially switched among the P requesting sources. So, each vector register has area:
Area = (number of element reg. + 6) (bits per reg. + 6) rbe,
but there is an additional switching overhead required to switch each of the n vector registers to each of the P external ports. We estimate this area (a switch point is about 2 rbe) as
Switch area = 2 (bits per reg.) P (number of vector reg.) rbe.
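As a worked illustration, the two estimates can be applied to the typical configuration mentioned above (eight vector registers of 64 elements, 64 bits each). This is a sketch under stated assumptions: the text does not give a value for P, so P = 4 is chosen arbitrarily, and summing the n per-register areas with the switch area to obtain a total is an inference from the description rather than a formula stated here. All function names are hypothetical.

```python
def vector_register_area_rbe(elements: int, bits_per_element: int) -> int:
    """Area of one dual-ported vector register:
    (elements + 6)(bits_per_element + 6) rbe."""
    return (elements + 6) * (bits_per_element + 6)


def switch_area_rbe(bits_per_element: int, ports: int, num_vector_regs: int) -> int:
    """Switching overhead: 2 rbe per switch point, one switch point per bit
    for each (vector register, external port) pair."""
    return 2 * bits_per_element * ports * num_vector_regs


def vector_file_area_rbe(num_vector_regs: int, elements: int,
                         bits_per_element: int, ports: int) -> int:
    """Assumed total: n register areas plus the switch area (an inference)."""
    registers = num_vector_regs * vector_register_area_rbe(elements, bits_per_element)
    return registers + switch_area_rbe(bits_per_element, ports, num_vector_regs)


if __name__ == "__main__":
    n, elements, bits = 8, 64, 64   # configuration given in the text
    ports = 4                       # assumption: P is not specified in the text
    print("per register:", vector_register_area_rbe(elements, bits), "rbe")
    print("switch:      ", switch_area_rbe(bits, ports, n), "rbe")
    print("total:       ", vector_file_area_rbe(n, elements, bits, ports), "rbe")
```

Under these assumptions the switch overhead grows only linearly in P, whereas a fully multiported register file grows roughly quadratically in P, which is the advantage of sharing ports among sequentially accessed elements.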

 