Figure 4.9 Some key modeling assumptions.
4.3 Evaluating Pipelined Processor Performance
To evaluate the execution (or interpretation) of a pipelined instruction, we assemble the various execution steps into the actions (cycles) required by a particular machine to execute a typical in-line instruction, forming the instruction timing template, for example:
This template describes the complete execution of a single instruction without dependencies. A simple (ALU-based) instruction is usually one that requires all of the steps of execution with a minimum number of cycles devoted to each execution step.
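As a rough illustration only, a timing template can be thought of as an ordered list of per-cycle steps. The sketch below assumes a hypothetical six-step sequence; only EX (execute) and PA (putaway) are named in this section, and the other stage names (IF, D, AG, DF) and both helper functions are assumptions made for the example, not the template of any particular machine.

```python
# Hypothetical stage names, one cycle each. Only EX and PA appear in the
# surrounding text; IF, D, AG and DF are assumed for illustration.
SIMPLE_TEMPLATE = ["IF", "D", "AG", "DF", "EX", "PA"]


def template_length(stages):
    """Number of cycles the instruction occupies (one cycle per listed stage)."""
    return len(stages)


def render_template(stages, start_cycle=0):
    """Return a one-line timing chart, shifted right by start_cycle cycles."""
    cells = ["  "] * start_cycle + [f"{stage:<2}" for stage in stages]
    return " ".join(cells)


# Two successive dependency-free instructions, decoded one cycle apart.
print(render_template(SIMPLE_TEMPLATE, start_cycle=0))
print(render_template(SIMPLE_TEMPLATE, start_cycle=1))
```

Printing the same template at successive start cycles gives the familiar staircase picture of overlapped instructions.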
The ideal relationship between successive instructions is determined by their relative decode cycles. Until the decode is complete, dependencies and the remaining instruction template cannot be established. The maximum instruction decode rate determines the peak instruction execution rate (in MIPS).
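The link between decode rate and peak MIPS can be sketched numerically. The function below is a minimal illustration; the name `peak_mips` and the parameter `cycles_between_decodes` are hypothetical, and the clock frequency is assumed to be given in MHz.

```python
def peak_mips(clock_mhz, cycles_between_decodes=1):
    """Peak execution rate in MIPS when one instruction is decoded
    every `cycles_between_decodes` cycles and nothing ever stalls."""
    if cycles_between_decodes < 1:
        raise ValueError("at most one instruction is decoded per cycle")
    return clock_mhz / cycles_between_decodes


# A 500 MHz clock decoding one instruction per cycle peaks at 500 MIPS;
# decoding only every second cycle halves that.
print(peak_mips(500.0))
print(peak_mips(500.0, cycles_between_decodes=2))
```

This is an upper bound: any dependency delay lowers the achieved rate below the peak.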
In evaluating the effect of dependencies, the instruction causing a dependency is designated by its hypothetical memory location (*). Occasionally, multiple dependencies exist. For example, in the case of a conditional branch, the outcome and the relative timing depend on both the branch (*) and a preceding instruction that sets the tested condition. Since all instructions must be decoded, the time to decode plus the minimum time between successive decodes is the reciprocal of the maximum decode rate (assuming a maximum of one instruction decoded per cycle). This time may be measured in seconds or in cycles. To this sum we add the delays for the various instruction dependencies to determine the actual time used by each instruction. In assessing these penalties, we usually make several assumptions (Figure 4.9).
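The accumulation described above amounts to simple addition. The sketch below assumes a base interval of one cycle per instruction and a list of per-dependency delays in cycles; the function names and the example numbers are hypothetical and chosen only to show the arithmetic.

```python
def instruction_time(base_cycles, dependency_delays):
    """Cycles actually used by one instruction: the base decode interval
    plus the delay contributed by each of its dependencies."""
    return base_cycles + sum(dependency_delays)


def cycles_to_seconds(cycles, cycle_time_s):
    """Convert a cycle count to seconds for a given cycle time."""
    return cycles * cycle_time_s


# Hypothetical conditional branch: a 2-cycle delay from the branch itself
# and a 1-cycle delay waiting for the condition-setting instruction.
branch_cycles = instruction_time(1, [2, 1])
print(branch_cycles)
print(cycles_to_seconds(branch_cycles, 2e-9))
```

An instruction with no dependencies has an empty delay list and costs only the base interval.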
Pipelined Processor Design Assumptions
1. We decode at most one instruction at a time.
2. We do not allow out-of-order instruction execution and putaway (PA). If the timing template has a PA cycle, then execution (EX) may complete out of order, but PA may not. If the template incorporates PA into EX and shows only EX cycles, then EX may not complete out of order. Note that instruction * + 1 does not execute before * even if * + 1 has no dependency