< previous page page_263 next page >

Page 263
5. Following study 4.4 and the assumptions of problem 1, compare delayed branch and early condition code setting for each of the timing templates. (Note: TIF takes the same number of cycles as a DF.)
(a) IBM 3033.
(b) Amdahl V-8.
(c) MIPS R2000.
Treat "half-cycles" as full cycles.
6. For the statement C := A + B and the assumptions of study 4.9, find the code and timing for our R/M machine with the IBM 3033 timing template. Again use the assumptions of problem 1.
7. Describe I-buffer arrangements (number of in-line and target registers) suitable to each of the IBM 3033, Amdahl V-8, and MIPS R2000 timing templates and for w (the size of the IF path) = 4 and 8 bytes. Assume branch prediction is used.
8. A certain store buffer has a size of 4 entries. The mean number used is 2 entries.
(a) Without knowing the variance, what is the probability of a "buffer full or overflow" delay?
(b) Now suppose the variance is known to be σ² = 0.5; what is the probability of such a delay?
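One way to approach problem 8 is with the standard Markov and Chebyshev inequalities, which bound a tail probability from the mean alone and from the mean plus the variance, respectively. The sketch below is only an illustration under stated assumptions: it treats "buffer full or overflow" as an occupancy of at least 4 entries, and the function names are hypothetical, not taken from the text. Both results are upper bounds, not exact probabilities.

```python
from fractions import Fraction


def markov_bound(mean, threshold):
    """Markov: P(X >= threshold) <= mean / threshold, for nonnegative X."""
    return min(Fraction(1), Fraction(mean) / Fraction(threshold))


def chebyshev_bound(mean, variance, threshold):
    """Chebyshev: P(X >= threshold) <= variance / (threshold - mean)**2.

    Uses P(X >= t) <= P(|X - mean| >= t - mean), valid when t > mean.
    """
    distance = Fraction(threshold) - Fraction(mean)
    if distance <= 0:
        return Fraction(1)  # the bound is uninformative at or below the mean
    return min(Fraction(1), Fraction(variance) / distance**2)


# Values from problem 8; "full" is assumed to mean occupancy >= buffer size.
buffer_size = 4
mean_used = 2
variance = Fraction(1, 2)

p_mean_only = markov_bound(mean_used, buffer_size)
p_with_variance = chebyshev_bound(mean_used, variance, buffer_size)

print("bound from mean only:    ", p_mean_only)
print("bound with variance known:", p_with_variance)
```

Knowing the variance can only help: when both bounds are available, the smaller of the two is the tighter estimate.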
9. Determine an effective static branch prediction strategy for the following timing templates:
(a) The Amdahl V-8.
(b) The MIPS R2000.
Treat "half-cycles" as full cycles. Follow study 4.6, then use data in study 4.7.
10. Modify the interlock of study 4.9 for an L/S architecture and MIPS R2000-type timing template.
11. Suppose the processor timing in study 4.10 were based on a machine with split I- and D-caches (a cache miss costs 6 cycles). Assume that all the code is initially present as a block in the I-cache, but that a store into a block in the I-cache invalidates that entry. Show the effect on the timing.
12. (a) Suppose a certain processor has the following BC behavior: A three-cycle penalty on correct guess of target, and a six-cycle penalty when it incorrectly guesses target and the code actually goes in-line. Similarly, it has a zero-cycle penalty on correct in-line guess, but a six-cycle penalty when it incorrectly guesses in-line and the target path is taken. The target path should be guessed when the probability of going to the target is known to exceed what percent?
(b) For an L/S machine that has a 3-cycle cache access and an 8-byte physical word, how many words (each 8 bytes) are required for the in-line (primary) path of an I-buffer to avoid runout?
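The break-even reasoning behind problem 12(a) can be set up as a comparison of expected penalties: guess the target whenever its expected penalty is lower than that of guessing in-line. The sketch below is a minimal illustration, assuming the four penalties are the only costs involved and that p is the probability the branch goes to the target; the function and parameter names are hypothetical.

```python
from fractions import Fraction


def expected_penalty(p_target, penalty_if_target, penalty_if_inline):
    """Expected cycles lost for one fixed guess, given P(branch goes to target)."""
    return p_target * penalty_if_target + (1 - p_target) * penalty_if_inline


def break_even_probability(guess_t_goes_t, guess_t_goes_i,
                           guess_i_goes_t, guess_i_goes_i):
    """Probability p at which both guesses cost the same on average.

    Solves p*tt + (1 - p)*ti == p*it + (1 - p)*ii for p, which gives
    p = (ti - ii) / ((ti - ii) + (it - tt)).
    Assumes a wrong guess costs more than the matching right guess.
    """
    inline_side = Fraction(guess_t_goes_i - guess_i_goes_i)
    target_side = Fraction(guess_i_goes_t - guess_t_goes_t)
    return inline_side / (inline_side + target_side)


# Penalties from problem 12(a), in cycles.
p_star = break_even_probability(
    guess_t_goes_t=3,  # guessed target, went to target
    guess_t_goes_i=6,  # guessed target, went in-line
    guess_i_goes_t=6,  # guessed in-line, went to target
    guess_i_goes_i=0,  # guessed in-line, went in-line
)

print("guess target when P(target) exceeds", p_star)
print("expected penalty at break-even:", expected_penalty(p_star, 3, 6))
```

Above the break-even probability the target guess has the lower expected penalty; below it, the in-line guess does.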

 