Table 10.3 Pipeline delay summary.

| Sequence | (a) Reduced-Scale Processor | (b) Super-Pipelined Processor |
|----------|-----------------------------|-------------------------------|
| EX/EX    |                             |                               |
| EX/AG    |                             |                               |
| LD/EX    |                             |                               |
| EX/ST    |                             |                               |
| LD/ST    |                             |                               |

Table 10.4 Branch delay summary (in cycles).

| Branch type (assumed path) | (a) Reduced-Scale Processor | (b) Super-Pipelined Processor |
|----------------------------|-----------------------------|-------------------------------|
| Unconditional              |                             |                               |
| Conditional (in-line)      |                             |                               |
| Conditional (target)       |                             |                               |

instructions. However, the number of cycles that are actually penalty cycles is not constant; it is a linear function of the distance between the two dependent instructions. The greater the distance between the two instructions, the smaller the actual penalty; the worst case occurs when the two instructions are sequential. Table 3.20 shows the distribution of distances between two dependent ALU operation instructions, and Table 3.19 shows the distribution of distances between an ALU operation and the address generate phase of an instruction. For simplicity, we assume that load and store instructions follow the same distribution as the two ALU operations.

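To make the distance dependence concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the text's own equation: it assumes the penalty drops by one cycle for each instruction separating the dependent pair and never goes below zero. The function names and the example distribution are hypothetical; the real distributions are those of Tables 3.19 and 3.20.

```python
def actual_penalty(nominal_delay: int, distance: int) -> int:
    """Penalty cycles remaining for a dependent pair `distance` apart.

    ASSUMPTION: distance 1 means the two instructions are sequential
    (worst case, full nominal delay), and each extra instruction between
    them hides one cycle of delay.
    """
    return max(0, nominal_delay - (distance - 1))


def expected_penalty(nominal_delay: int, distance_probs: dict[int, float]) -> float:
    """Weight the per-distance penalty by the probability of each distance."""
    return sum(p * actual_penalty(nominal_delay, d)
               for d, p in distance_probs.items())


# HYPOTHETICAL distribution of dependency distances (not from the tables).
example_probs = {1: 0.5, 2: 0.25, 3: 0.25}

# Nominal delay of 2 cycles: penalties are 2, 1, 0 at distances 1, 2, 3.
print(expected_penalty(2, example_probs))  # 0.5*2 + 0.25*1 + 0.25*0 = 1.25
```

With a nominal delay of 2 cycles, only pairs at distance 1 or 2 contribute, so the average penalty per dependent pair is well below the worst-case figure.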
From Chapter 4, we know that the actual penalty from a given instruction is found by applying the equation:

where $P_{1,2}$ is the total penalty between the instructions over all possible dependency distances. The pipeline delay for any given instruction pair is