H. S. Stone. High-Performance Computer Architecture, 2nd edition. Electrical and Computer Engineering. Addison-Wesley, Reading, MA, 1990.
Branch Performance
J. A. DeRosa and H. M. Levy. An evaluation of branch architectures. Proceedings of the 14th Annual Symposium on Computer Architecture, pages 10-16, June 1987.
J. E. Smith. A study of branch prediction strategies. Proceedings of the 8th Annual Symposium on Computer Architecture, pages 135-148, May 1981.
4.13 Problem Set
1. Evaluate the performance (cycles per instruction) of processors based on each of the three templates (IBM 3033, Amdahl V-8, and MIPS R2000, assuming the same ALU operations) described earlier in the chapter. Assume all have the same (unit) cycle time, with the relative decode rate as shown in Section 4.2.1. Treat the "half-cycles" as full cycles. Also assume that the CC is set at the end of the last EX cycle. Follow the assumptions of study 4.3.
2. Repeat study 4.2 for the R/M machine used in that study, but with the following code sequence:
LD     R3,           1000[R1,R2]
ADD    R3,           1008[R1,R2]
MPY    R4,           2000[R1,R3]
ST     3000[R1,R3], R4
BC.NE* TARGET[R1,R3]
*Assume the ADD is the only instruction in this sequence that sets the CC.
Follow all other assumptions made in study 4.2.
3. Now repeat study 4.3, ignoring "run-on" effects. Use the design target data (scientific environment) from Chapter 3 for branch and address dependencies, and apply it to the timing templates of:
(a) The IBM 3033.
(b) The Amdahl V-8.
(c) The MIPS R2000.
Treat "half-cycles" as full cycles.
4. Repeat study 4.3 using design target data (scientific environment) from Chapter 3. Assume that the only run-on instructions are the variable field length instructions (plus LDM and STM). The R/M architecture described includes these MM instructions. Calculate run-on effects as one EX cycle for each byte in the larger of the source operands. Note that the MM instructions necessarily have an extended timing template (for multiple AG, DF, etc.) and this must be included.
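The run-on rule in problem 4 can be sketched as a small calculation. This is only an illustration of the stated rule (one EX cycle per byte of the larger source operand). The function names, the operand lengths, and the instruction frequency used below are hypothetical placeholders; the actual frequencies must be taken from the Chapter 3 design target data, and the extended AG and DF portions of the MM timing template are not modeled here.

```python
def run_on_ex_cycles(src1_bytes: int, src2_bytes: int) -> int:
    """EX cycles charged to one MM instruction under the problem's rule:
    one EX cycle for each byte in the larger of the two source operands."""
    if src1_bytes < 0 or src2_bytes < 0:
        raise ValueError("operand lengths must be non-negative")
    return max(src1_bytes, src2_bytes)


def run_on_cpi_contribution(frequency: float, src1_bytes: int, src2_bytes: int) -> float:
    """Average cycles per instruction contributed by this run-on effect.

    `frequency` is the fraction of executed instructions that are of this
    MM type (a placeholder here; the real value comes from Chapter 3).
    """
    return frequency * run_on_ex_cycles(src1_bytes, src2_bytes)


# Hypothetical usage: operands of 4 and 9 bytes are charged 9 EX cycles.
cycles = run_on_ex_cycles(4, 9)
```

The contribution from each run-on instruction type would then be summed and added to the base cycles per instruction found in study 4.3.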