(a) Will this scheme work?
(b) Find the new module address for each of the addresses in problem 2.
4. A certain vector processor has a cycle time of 8 ns and a memory cycle of 64 ns. It uses eight memory modules and does not bypass requests in the memory buffer. For a sustained vector environment of two requests per processor cycle,
(a) What is the requested (offered) memory bandwidth (in MBps)?
(b) What is the achieved memory bandwidth (in MBps)?
(c) What is the mean queue size of requests waiting for memory?
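As a starting point for the arithmetic in problem 4, the following minimal sketch computes the offered request rate and the peak rate at which the modules can service requests. The word size (`BYTES_PER_REQUEST = 8`) and the function names are assumptions made for illustration; this page does not state them. The peak figure is only an upper bound on the achieved bandwidth, which depends on the buffer model developed in the chapter and is not reproduced here.

```python
# Hypothetical helper names; BYTES_PER_REQUEST is an assumed word size.
BYTES_PER_REQUEST = 8


def offered_rate(requests_per_cycle: float, cycle_ns: float) -> float:
    """Requests per second issued by the processor."""
    return requests_per_cycle * 1e9 / cycle_ns


def peak_service_rate(modules: int, memory_cycle_ns: float) -> float:
    """Requests per second if every module is busy all the time."""
    return modules * 1e9 / memory_cycle_ns


def to_mbps(rate: float, bytes_per_request: int = BYTES_PER_REQUEST) -> float:
    """Convert a request rate to megabytes per second (1 MB = 10**6 bytes)."""
    return rate * bytes_per_request / 1e6


offered = offered_rate(2, 8)          # two requests every 8 ns
peak = peak_service_rate(8, 64)       # eight modules, 64 ns memory cycle
print(to_mbps(offered), to_mbps(peak))
```

Under these assumptions the offered rate is twice the peak service rate, so the memory system saturates and requests queue up.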
5. For problem 4, can you define a bypass buffer to improve performance? Explain carefully.
6. Suppose in problem 4 we now increase the interleaving to m = 17. Repeat problems 4 and 5.
7. Derive the rules for finding the quotient (section 7.3.1) after division by 2^k - 1.
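One way to check a derived rule for problem 7 is to compare it against ordinary integer division. The sketch below uses the identity x = hi * 2^k + lo = hi * (2^k - 1) + (hi + lo), so each step adds `hi` to the quotient and continues with `hi + lo`. It is an illustrative approach under that identity, not necessarily the rule as stated in section 7.3.1, and the function name is hypothetical.

```python
def divmod_pow2_minus_1(x: int, k: int) -> tuple[int, int]:
    """Quotient and remainder of x divided by 2**k - 1, using shifts and adds.

    Assumes x >= 0 and k >= 1.
    """
    m = (1 << k) - 1          # the divisor, 2**k - 1
    q = 0
    while x > m:
        hi = x >> k           # high part: x // 2**k
        lo = x & m            # low k bits: x % 2**k
        q += hi               # hi * 2**k = hi * m + hi, so hi joins the quotient
        x = hi + lo           # what is left to divide
    if x == m:                # a residue equal to the divisor is one more multiple
        q += 1
        x = 0
    return q, x


print(divmod_pow2_minus_1(1000, 5))   # divisor is 31
```

The loop terminates because each pass reduces x by hi * (2^k - 1), which is at least 1 whenever x exceeds the divisor.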
8. Suppose γ = 0.5 can be realized with a certain buffer and m = 17. The processor parameters are the same as those in problem 4. What is the achieved memory bandwidth?
9. For concurrent code execution, what type of dependencies arise in the following code sequence?
DIV.F R1, R2, R3
MPY.F R1, R4, R5
ADD.F R4, R5, R6
ADD.F R5, R4, R7
ST.F  ALPHA, R5
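The kind of bookkeeping problem 9 asks for can be sketched mechanically. The code below assumes a destination-first operand order (the first operand is written, the rest are read) and treats `ST.F ALPHA, R5` as reading R5 and writing memory location ALPHA; both conventions are assumptions here and should be checked against the instruction format defined in the text. It compares every ordered pair of instructions and does not account for intervening redefinitions, so it is a simple pairwise check rather than a full analysis.

```python
from itertools import combinations

# (opcode, destination, sources), assuming destination-first operand order.
PROGRAM = [
    ("DIV.F", "R1", ["R2", "R3"]),
    ("MPY.F", "R1", ["R4", "R5"]),
    ("ADD.F", "R4", ["R5", "R6"]),
    ("ADD.F", "R5", ["R4", "R7"]),
    ("ST.F", "ALPHA", ["R5"]),   # store: reads R5, writes memory at ALPHA
]


def find_dependencies(program):
    """Return (earlier, later, kind, name) tuples with 1-based instruction numbers."""
    deps = []
    for (i, (_, dst_i, src_i)), (j, (_, dst_j, src_j)) in combinations(
        enumerate(program, start=1), 2
    ):
        if dst_i in src_j:       # later instruction reads what the earlier one wrote
            deps.append((i, j, "RAW", dst_i))
        if dst_j in src_i:       # later instruction overwrites what the earlier one read
            deps.append((i, j, "WAR", dst_j))
        if dst_i == dst_j:       # both write the same location
            deps.append((i, j, "WAW", dst_i))
    return deps


for dep in find_dependencies(PROGRAM):
    print(dep)
```

Here RAW, WAR, and WAW are the usual names for essential (true), ordering (anti), and output dependencies respectively.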
10. A vector processor has four ports to memory (i.e., it can make up to four memory requests per processor cycle). The processor cycle is 8 ns, while the memory cycle is 42 ns. What is the minimum amount of interleaving required to support a bypassed buffered system so that there will be no memory contention delay?
If m = 64, what is γ_opt?
What will be the mean total buffer size?
If the achieved γ = 0.2 and m = 64, what is B(m,n,γ), and what fraction of its maximum performance is the system able to achieve? (Ignore startup.)
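For the first part of problem 10, a simple bandwidth-balance argument gives a lower bound on the interleaving: the modules must be able to absorb requests at least as fast as the processor issues them. The sketch below computes only that necessary condition and, as a separate assumption, rounds up to a power of two for conventional low-order interleaving. It does not apply the chapter's buffer model, so the answer intended by the text may be larger. The function names are hypothetical.

```python
import math


def min_modules_for_balance(requests_per_cycle: int,
                            cycle_ns: float,
                            memory_cycle_ns: float) -> int:
    """Smallest m such that m / memory_cycle >= requests_per_cycle / cycle."""
    return math.ceil(requests_per_cycle * memory_cycle_ns / cycle_ns)


def next_power_of_two(m: int) -> int:
    """Round m up to a power of two (assumes m >= 1)."""
    return 1 << (m - 1).bit_length()


m_min = min_modules_for_balance(4, 8, 42)   # four ports, 8 ns cycle, 42 ns memory cycle
print(m_min, next_power_of_two(m_min))
```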
11. A four-ported (three read, one write) memory system would support a vector processor with what maximum speedup over a uniprocessor? Explain.
12. (a) Show the precedence matrix, M1, for the code in problem 9.
(b) Find the full precedence matrix showing all dependencies.

 