(a) Will this scheme work?

(b) Find the new module address for each of the addresses in problem 2.

4. A certain vector processor has a cycle time of 8 ns and a memory cycle of 64 ns. It uses eight modules and does not bypass requests in the memory buffer. For a sustained vector environment of two requests per processor cycle,

(a) What is the requested (offered) memory bandwidth (in MBps)?

(b) What is the achieved memory bandwidth (in MBps)?

(c) What is the mean queue size of requests waiting for memory?

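As a starting point for the arithmetic in problem 4, the following is a minimal sketch of how the offered load can be expressed. It is not a solution: the request size in bytes is not stated in the problem, so `bytes_per_request` is a hypothetical parameter, and 1 MB is assumed to mean 10^6 bytes. The function names are illustrative only.

```python
def requests_per_memory_cycle(requests_per_cycle, processor_cycle_ns, memory_cycle_ns):
    """Number of requests the processor issues during one memory cycle."""
    return requests_per_cycle * memory_cycle_ns / processor_cycle_ns


def offered_bandwidth_mbps(requests_per_cycle, processor_cycle_ns, bytes_per_request):
    """Offered bandwidth in MBps, assuming 1 MB = 10**6 bytes.

    bytes_per_request is an assumption; the problem statement does not give it.
    The factor 1000 converts bytes per ns into 10**6 bytes per second.
    """
    return requests_per_cycle * bytes_per_request * 1000 / processor_cycle_ns


if __name__ == "__main__":
    # Parameters taken from problem 4; the 8-byte request size is hypothetical.
    print(requests_per_memory_cycle(2, 8, 64))
    print(offered_bandwidth_mbps(2, 8, bytes_per_request=8))
```

The achieved bandwidth and queue size in parts (b) and (c) depend on the memory contention model of the chapter and are deliberately left out of this sketch.
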
5. For problem 4, can you define a bypass buffer to improve performance? Explain carefully.

6. Suppose in problem 4 we now increase the interleaving to m = 17. Repeat problems 4 and 5.

7. Derive the rules for finding the quotient (section 7.3.1) after division by 2^k - 1.

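A derived rule for problem 7 can be checked numerically. The sketch below does not derive the quotient rule; it only uses the well-known fact that the remainder modulo 2^k - 1 can be obtained by repeatedly summing the k-bit digits of the number, and then recovers the quotient from that remainder so it can be compared against ordinary division. The function names are illustrative, not taken from the text.

```python
def remainder_mod_2k_minus_1(n, k):
    """Remainder of n divided by 2**k - 1, found by folding k-bit digits."""
    m = (1 << k) - 1
    while n > m:
        total = 0
        while n:
            total += n & m   # add the lowest k-bit digit
            n >>= k          # move on to the next digit
        n = total
    # A folded value equal to m itself represents remainder 0.
    return 0 if n == m else n


def quotient_by_2k_minus_1(n, k):
    """Quotient recovered from the folded remainder (exact division)."""
    m = (1 << k) - 1
    return (n - remainder_mod_2k_minus_1(n, k)) // m


if __name__ == "__main__":
    # Compare against Python's built-in divmod for a few values.
    for n in (0, 17, 255, 1000):
        print(n, quotient_by_2k_minus_1(n, 4), remainder_mod_2k_minus_1(n, 4), divmod(n, 15))
```
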
8. Suppose g = 0.5 can be realized with a certain buffer and m = 17. The processor parameters are the same as those in problem 4. What is the achieved memory bandwidth?

9. For concurrent code execution, what type of dependencies arise in the following code sequence?

DIV.F R1, R2, R3
MPY.F R1, R4, R5
ADD.F R4, R5, R6
ADD.F R5, R4, R7
ST.F ALPHA, R5

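One way to approach problem 9 mechanically is to list, for each instruction, the locations it reads and writes, and then compare every earlier instruction with every later one. The sketch below does this under a loudly stated assumption: the last operand of each arithmetic instruction is taken to be the destination, and ST.F is taken to read R5 and write memory location ALPHA. If the text's operand convention differs, the read and write sets must be changed accordingly. The sketch compares all pairs and does not account for intervening writes.

```python
# Each entry: (opcode, locations read, locations written).
# ASSUMPTION: destination is the last operand; ST.F stores R5 to ALPHA.
PROGRAM = [
    ("DIV.F", {"R1", "R2"}, {"R3"}),
    ("MPY.F", {"R1", "R4"}, {"R5"}),
    ("ADD.F", {"R4", "R5"}, {"R6"}),
    ("ADD.F", {"R5", "R4"}, {"R7"}),
    ("ST.F", {"R5"}, {"ALPHA"}),
]


def find_dependencies(program):
    """Return (earlier, later, kind) triples using 1-based instruction numbers."""
    found = []
    for j, (_, reads_j, writes_j) in enumerate(program):
        for i in range(j):
            _, reads_i, writes_i = program[i]
            if writes_i & reads_j:
                found.append((i + 1, j + 1, "RAW"))  # read after write
            if reads_i & writes_j:
                found.append((i + 1, j + 1, "WAR"))  # write after read
            if writes_i & writes_j:
                found.append((i + 1, j + 1, "WAW"))  # write after write
    return found


if __name__ == "__main__":
    for earlier, later, kind in find_dependencies(PROGRAM):
        print(f"instruction {later} depends on instruction {earlier}: {kind}")
```

The same pairwise comparison can serve as a starting point for the precedence matrix asked for in problem 12, with one matrix entry per (earlier, later) pair.
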
10. A vector processor has four ports to memory (i.e., it can make up to four memory requests per processor cycle). The processor cycle is 8 ns, while the memory cycle is 42 ns. What is the minimum amount of interleaving required to support a bypassed buffered system so that there will be no memory contention delay?

If m = 64, what is g_opt?

What will be the mean total buffer size?

If the achieved g = 0.2 and m = 64, what is B(m, n, g)? What fraction of its maximum performance is the system able to achieve? (Ignore startup.)

11. A four-ported (three read, one write) memory system would support a vector processor with what maximum speedup over a uniprocessor? Explain.

12. (a) Show the precedence matrix, M1, for the code in problem 9.

(b) Find the full precedence matrix showing all dependencies.