The total unpartitioned execution time is simply the sum of the event times, T = 94 ns. We assume that each event can be segmented.
For this study, we assume a fixed clocking overhead of 2 ns and ignore skew effects, i.e., k = 0. If we select a cycle time of 12 + 2 = 14 ns, each of B, C, D, and E takes two cycles to execute, for nine cycles in total. The clocking overhead is the fixed clock overhead times the number of cycles. The quantization overhead is the difference between the total partitioned instruction execution time and the sum of T and the clocking overhead. Repeating this for other cycle times, we can tabulate the alternatives for total execution time. Assuming b = 0.2, we can then compute G as in Study 2.1. We can compute Sopt = 13.7, but any cycle time shorter than 14 ns (12 + 2) clearly incurs excessive quantization overhead. Thus, we start with S = 9.
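The two overhead definitions above can be checked numerically for the 14 ns case. The following is a minimal sketch, not the book's own procedure; the function and parameter names are illustrative assumptions, and only values quoted in the text (T = 94 ns, 2 ns clocking overhead, 14 ns cycle, S = 9) are used.

```python
def partition_overheads(total_time_ns, clock_overhead_ns, cycle_time_ns, num_cycles):
    """Return (partitioned time, clocking overhead, quantization overhead) in ns.

    Names are hypothetical; the definitions follow the text:
      clocking overhead     = fixed clock overhead * number of cycles
      quantization overhead = partitioned time - (T + clocking overhead)
    """
    partitioned = num_cycles * cycle_time_ns
    clocking = num_cycles * clock_overhead_ns
    quantization = partitioned - (total_time_ns + clocking)
    return partitioned, clocking, quantization


# 14 ns cycle, S = 9 cycles, T = 94 ns, 2 ns clocking overhead per cycle
partitioned, clocking, quantization = partition_overheads(94, 2, 14, 9)
print(partitioned, clocking, quantization)  # 126 18 14
```

The cycle counts for the other candidate cycle times depend on the individual event durations, which are not restated here, so the sketch takes the number of cycles as an input rather than deriving it.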
| Cycle time | | | | | |
| 14 ns | | | | | |
| 17 ns | | | | | |
| 21 ns | | | | | |
| 26 ns | | | | | |
Here, 14 ns is the preferred cycle time and S = 9.
Note, however, that if B = 30 ns and D = 30 ns, then T = 106 ns and Sopt = 14.6. We would again start with a cycle time of 14 ns, now with S = 11.
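The two quoted values of Sopt can be reproduced with a short calculation. The expression for Sopt is not restated in this passage; the sketch below assumes the form sqrt((1 - b) T / (b x clocking overhead)), which matches both 13.7 and 14.6 for b = 0.2 and a 2 ns clocking overhead. The function name is a hypothetical helper.

```python
import math


def optimal_segments(total_time_ns, clock_overhead_ns, b):
    """Assumed form of Sopt: sqrt((1 - b) * T / (b * clocking overhead))."""
    return math.sqrt((1 - b) * total_time_ns / (b * clock_overhead_ns))


print(round(optimal_segments(94, 2, 0.2), 1))   # 13.7
print(round(optimal_segments(106, 2, 0.2), 1))  # 14.6
```

In both cases Sopt implies a cycle time well under 14 ns, which is why the search starts from the 14 ns cycle instead.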
| Cycle time | | | | | |
| 14 ns | | | | | |
| 17 ns | | | | | |
| 21 ns | | | | | |
| 26 ns | | | | | |
Now a 17 ns cycle is optimal.
2.2.7 Wave Pipelining: The Ultimate Limit on Pipelined Processor Cycle Time
What is the fastest cycle time, and how can one go about achieving it?
In the foregoing discussion, cycle time is determined by the maximum delay through a pipeline segment. The cycle time can be reduced further by ensuring that the minimum delay through a pipeline segment is close to the maximum.
It may seem strange that very fast cycle times can be achieved by adding delay to the minimum path through a logic segment (Pmin). Rather than resorting to exotic techniques to minimize the maximum delay (Pmax) through a pipeline segment, we can, if we have good control over the minimum delay, use this delay as a sort of storage. This allows "waves" of unlatched data to proceed through the various pipeline segments, hence the term "wave" pipelining.
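To make the role of Pmin concrete, the sketch below compares a conventional cycle time with an idealized wave-pipelined one. It assumes the commonly cited bound that the wave-pipelined cycle is roughly (Pmax - Pmin) plus the clocking overhead, with skew ignored; that bound is not stated in the passage above, and the function names and the 9 ns value of Pmin are hypothetical.

```python
def conventional_cycle_ns(p_max_ns, clock_overhead_ns):
    """Conventional pipelining: the cycle must cover the longest path plus overhead."""
    return p_max_ns + clock_overhead_ns


def wave_cycle_ns(p_max_ns, p_min_ns, clock_overhead_ns):
    """Idealized wave pipelining (assumed bound): only the spread between the
    longest and shortest paths, plus overhead, limits the cycle."""
    return (p_max_ns - p_min_ns) + clock_overhead_ns


# Hypothetical segment: Pmax = 12 ns, Pmin padded up to 9 ns, 2 ns overhead
print(conventional_cycle_ns(12, 2))  # 14
print(wave_cycle_ns(12, 9, 2))       # 5
```

Under this assumption, padding the short paths so that Pmin approaches Pmax shrinks the achievable cycle, which is the sense in which added delay acts as storage.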