Figure 4.17
Instruction prefetch. Three IF requests are made to fill the I-buffer (PFR1 through PFR3). The first completes at the end of cycle 6 and contains two instructions, 1A and 1B, which are decoded in cycles 7 and 8, respectively. After cycle 8, when instruction 1B is decoded, a PFR is available and IF 4 begins.
The important operational parameters of the instruction prefetch unit are the instruction length distribution (Table 3.2), the width of the memory interface that supplies the unit, the width and number of prefetch registers (PFRs), and the distribution of instruction execution sequence lengths (Figure 3.4 and Table 3.11).
The operation of a prefetch unit is illustrated in Figure 4.17. This example assumes a memory access time of six processor cycles and three prefetch registers, each capable of holding two instructions. The processor is pipelined and can decode one instruction per processor cycle as long as the prefetch buffer (PFB) is not empty. Memory interference is ignored, and the memory is assumed to accept one request per processor cycle.
The first instruction word fetched corresponds to the target of a taken branch. Since all three registers are invalid, three instruction fetches (two instructions per instruction word) are initiated on successive cycles. The instruction words begin arriving six cycles later, and decoding begins as soon as the first word arrives. Since the words arrive faster than the instruction unit can decode them, after the six-cycle delay in decoding the first instruction the next five instructions are processed without delay. However, since there are only three registers, the request for the fourth instruction word cannot be initiated until the first instruction word has been processed, i.e., until its two instructions have been decoded (after cycle 8 in Figure 4.17). Consequently, the seventh instruction sees a delay of two cycles: four of the six cycles of the memory access time are masked by the decode of the second and third instruction words, and only the remaining two are exposed. Masking memory access time in this way is, of course, the purpose of the PFB. Had the number of prefetch registers been four or more, the memory access time would have been completely masked for all but the first instruction.
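The timing in this example can be checked with a small simulation. The following is only a minimal sketch under the assumptions stated in the text (one memory request per cycle, one decode per cycle, no memory interference, first fetch issued in cycle 1). The function names `simulate_prefetch` and `stalls`, and their parameters, are hypothetical and chosen here for illustration; they do not come from the text.

```python
def simulate_prefetch(access_time=6, num_pfrs=3, instrs_per_word=2, num_words=4):
    """Return (issue, decode) cycle lists for a simplified prefetch-buffer model.

    Assumed model: a fetch issued in cycle c delivers its word at the end of
    cycle c + access_time - 1, so the word can first be decoded in cycle
    c + access_time. A register is reused only after every instruction of the
    word it held has been decoded.
    """
    issue = []   # cycle in which the fetch for each instruction word is issued
    decode = []  # cycle in which each instruction is decoded, in program order
    for w in range(num_words):
        # The memory accepts at most one request per cycle.
        earliest = issue[-1] + 1 if issue else 1
        if w >= num_pfrs:
            # Wait until the word that previously occupied this register
            # has had its last instruction decoded.
            last_instr = (w - num_pfrs + 1) * instrs_per_word - 1
            earliest = max(earliest, decode[last_instr] + 1)
        issue.append(earliest)
        ready = earliest + access_time  # first cycle the word can be decoded
        for _ in range(instrs_per_word):
            prev = decode[-1] if decode else 0
            decode.append(max(prev + 1, ready))
    return issue, decode


def stalls(decode):
    """Idle decode cycles preceding each instruction (counting from cycle 0)."""
    result, prev = [], 0
    for cycle in decode:
        result.append(cycle - prev - 1)
        prev = cycle
    return result


issue, decode = simulate_prefetch()
print(issue)           # [1, 2, 3, 9]
print(decode)          # [7, 8, 9, 10, 11, 12, 15, 16]
print(stalls(decode))  # [6, 0, 0, 0, 0, 0, 2, 0]

# With four registers, only the first instruction is delayed.
_, decode4 = simulate_prefetch(num_pfrs=4, num_words=6)
print(stalls(decode4))  # [6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

With the default parameters, the fourth fetch is issued in cycle 9 and the seventh instruction is decoded in cycle 15 instead of cycle 13, which is the two-cycle delay described above. In this simplified model, the refill of one register overlaps the decode of the words held in the other registers, so with three two-instruction registers only four of the six access cycles are hidden, while four registers hide all six.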
In order to avoid I-buffer runout, the primary path buffers (or buffer registers) must be sufficient to cover the IF access time. The width of the IB |