
Page 251

Table 4.18 Estimated arithmetic delays (EX) in some current microprocessors.

E_i        IBM RS/6000    HP PA-RISC 1.1    MIPS R4000*

E₀
ADD/SUB
MPY
DIV
ADD.F
MPY.F
DIV.F

* In half cycles. Run-on delay = E_i - E₀. If E_i - E₀ < 0, then the run-on delay is 0.



		The total run-on effect is simply:

		$$\text{total run-on delay} = \sum_i w_i \,(E_i - E_0)$$

		where w_i is the fraction of occurrence of instruction type i, E_i is its number of EX cycles, and E₀ is the baseline EX time listed in Table 4.18. If (E_i - E₀) is negative, it is treated as a zero entry in the run-on delay summation.
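		The run-on summation described above can be sketched in a few lines of Python. This is only an illustration: the function name, the instruction mix, and the EX cycle counts below are hypothetical values chosen for easy arithmetic, not the entries of Table 4.18. Instruction types that are not listed are assumed to contribute no run-on delay.

```python
def total_run_on_delay(mix, ex_cycles, e0):
    """Sum w_i * (E_i - E0) over instruction types, treating negative terms as zero.

    mix       -- fraction of occurrence w_i for each instruction type
    ex_cycles -- EX cycles E_i for each instruction type
    e0        -- baseline EX time E0
    """
    return sum(w * max(ex_cycles[op] - e0, 0) for op, w in mix.items())


# Hypothetical EX cycle counts and instruction frequencies (assumptions).
ex_cycles = {"ADD/SUB": 1, "MPY": 3, "DIV": 10, "ADD.F": 2}
mix = {"ADD/SUB": 0.25, "MPY": 0.0625, "DIV": 0.03125, "ADD.F": 0.125}

# With E0 = 1, ADD/SUB adds nothing; MPY, DIV, and ADD.F each run on.
print(total_run_on_delay(mix, ex_cycles, e0=1))  # 0.53125

# With E0 = 2, ADD/SUB would give a negative term, so it is clamped to zero.
print(total_run_on_delay(mix, ex_cycles, e0=2))  # 0.3125
```

		The result is in cycles per instruction when w_i is a per-instruction frequency. For a machine whose table entries are in half cycles, the same units would apply to the result.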



		The following study illustrates what can happen to machine performance when stores occur into locations that are close to the current area of instruction execution. Machines tend to make the worst possible assumptions (to err on the side of safety) about the behavior of instructions in execution.



		4.8.1 Store in Instruction Stream Delay



		This delay occurs when a store address falls within the address range of the instructions already in the pipeline and instruction fetch buffer. These instructions are discarded, and instruction fetching is reinitiated when the store is completed. Study 4.10 illustrates the resulting problem.



		Fortunately, self-modifying programs have been recognized as being undesirable for other reasons and are relatively rare; but even if the instruction code is not self-modifying, it may appear to be so if it is intermixed with data variables.
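		The condition that triggers this delay amounts to an address-range overlap test. The sketch below is a minimal model under stated assumptions: `FetchWindow` and `store_hits_instruction_stream` are hypothetical names, addresses are byte addresses, and the window is a single contiguous range covering the pipeline and the instruction fetch buffer. A real machine may make the comparison at a coarser granularity, which only makes apparent hits more likely.

```python
from dataclasses import dataclass


@dataclass
class FetchWindow:
    """Address range [start, end) of instructions in the pipeline and fetch buffer."""
    start: int
    end: int


def store_hits_instruction_stream(store_addr, store_size, window):
    """Return True if any stored byte falls inside the in-flight instruction range."""
    return store_addr < window.end and store_addr + store_size > window.start


# Hypothetical window: 32 bytes of fetched instructions starting at 0x1000.
window = FetchWindow(start=0x1000, end=0x1020)

# A store to a data variable placed among the instructions looks like
# self-modifying code, so the buffered instructions would be discarded.
print(store_hits_instruction_stream(0x1010, 4, window))  # True

# A store to a distant data area does not disturb instruction fetch.
print(store_hits_instruction_stream(0x2000, 4, window))  # False
```

		The test cannot tell a genuine instruction modification from a store to a data variable that merely lies within the range, which is why code intermixed with data pays the same penalty.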



		Study 4.10 Delays Due to Apparent Stores into the Instruction Stream



		This study uses a simple timing template similar to the ones used earlier in the chapter. Since only load, store, and branch
