The total unpartitioned execution time is simply the sum of the events, T = 94 ns. We assume that each event can be segmented.
For this study, we assume a fixed clocking overhead of 2 ns per cycle and ignore skew effects, i.e., k = 0. If we select a cycle time of 12 + 2 = 14 ns, each of B, C, D, and E takes two cycles to execute (nine cycles in total). The clocking overhead is simply the fixed clock overhead times the number of cycles. The quantization overhead is the total partitioned instruction execution time minus the sum of T and the clocking overhead. Continuing for various cycle times, we can tabulate the alternatives for total execution time. Assuming b = 0.2, we can then compute G as in study 2.1. We can compute Sopt = 13.7, but clearly any cycle time less than 14 ns (12 + 2) has excessive quantization overhead. Thus, we start with S = 9.
Cycle time   Required # cycles   Clocking overhead   Quantization overhead   Total instr execution   G in MIPS
14 ns        9                   18 ns               14 ns                   126 ns                  27.5
17 ns        8                   16 ns               26 ns                   136 ns                  24.5
21 ns        7                   14 ns               39 ns                   147 ns                  21.6
26 ns        5                   10 ns               26 ns                   130 ns                  21.4

Here, 14 ns is the preferred cycle time and S = 9.
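The arithmetic behind these rows can be sketched in a few lines of Python. This is only an illustration: the function name evaluate is hypothetical, and the expression used for G, (1 / (1 + (S - 1)b)) x (1 / cycle time), is an assumption that happens to reproduce the tabulated values; the text itself only refers to study 2.1 for it.

```python
# Inputs taken from the text; the G formula is an assumption (see lead-in).
T = 94.0   # total unpartitioned execution time, ns
C = 2.0    # fixed clocking overhead per cycle, ns
B = 0.2    # parameter b from study 2.1


def evaluate(cycle_ns, cycles, total_ns=T, clock_ns=C, b=B):
    """Return (clocking overhead, quantization overhead, total time, G in MIPS)."""
    clocking = clock_ns * cycles              # fixed overhead times number of cycles
    total = cycle_ns * cycles                 # total partitioned execution time
    quantization = total - (total_ns + clocking)
    # Assumed form of G; 1000 / (time in ns) converts to millions per second.
    g_mips = 1000.0 / (cycle_ns * (1.0 + (cycles - 1) * b))
    return clocking, quantization, total, g_mips


# (cycle time in ns, required number of cycles), as tabulated in the text
rows = [(14, 9), (17, 8), (21, 7), (26, 5)]
table = [(ct, s) + evaluate(ct, s) for ct, s in rows]

for ct, s, clk, q, tot, g in table:
    print(f"{ct} ns  S={s}  clocking={clk:.0f} ns  "
          f"quantization={q:.0f} ns  total={tot:.0f} ns  G={g:.1f} MIPS")
```

Under these assumptions the 14 ns row gives the largest G, which agrees with the choice made in the text.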
Note, however, that if B = 30 ns and D = 30 ns, then T = 106 ns and Sopt = 14.6. We would again start with a cycle time of 14 ns, now with S = 11.
Cycle time   Required # cycles   Clocking overhead   Quantization overhead   Total instr execution   G in MIPS
14 ns        11                  22 ns               26 ns                   154 ns                  23.8
17 ns        8                   16 ns               14 ns                   136 ns                  24.5
21 ns        7                   14 ns               27 ns                   147 ns                  21.6
26 ns        7                   14 ns               62 ns                   182 ns                  17.5

Now a 17-nanosecond cycle is optimum.
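As a rough check on the two Sopt figures and on the shift in the preferred cycle time, the following Python sketch may help. The closed form Sopt = sqrt((1 - b)T / (bC)) with k = 0 and the expression for G are assumptions, chosen because they match the numbers quoted in this study; the helper names s_opt and g_mips are hypothetical.

```python
import math

C = 2.0   # fixed clocking overhead per cycle, ns (from the text)
B = 0.2   # parameter b (from the text)


def s_opt(total_ns, clock_ns=C, b=B):
    """Assumed closed form for the optimum number of segments, with k = 0."""
    return math.sqrt((1.0 - b) * total_ns / (b * clock_ns))


def g_mips(cycle_ns, cycles, b=B):
    """Assumed form of G: (1 / (1 + (S - 1) b)) * (1 / cycle time), in MIPS."""
    return 1000.0 / (cycle_ns * (1.0 + (cycles - 1) * b))


print(round(s_opt(94.0), 1))    # first case, T = 94 ns
print(round(s_opt(106.0), 1))   # second case, T = 106 ns

# (cycle time in ns, required cycles) for the second case, as tabulated
second_case = [(14, 11), (17, 8), (21, 7), (26, 7)]
best = max(second_case, key=lambda row: g_mips(*row))
print(best)
```

With these assumptions the sketch prints 13.7 and 14.6 for Sopt and selects the (17, 8) row, in line with the text. It also illustrates that the closed-form optimum is only a starting point: quantization overhead decides which candidate cycle time actually wins.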
2.2.7 Wave Pipelining: The Ultimate Limit on Pipelined Processor Cycle Time
What is the fastest cycle time, and how can one go about achieving it?
In the foregoing discussion, cycle time is determined by the maximum delay through a pipeline segment. This time can be reduced by ensuring that the minimum pipeline segment delay is close to the maximum.
It may seem strange that very fast cycle times can be achieved by adding delay to the minimum path through a logic segment (Pmin). Rather than resort to exotic techniques to minimize the maximum delay (Pmax) through a pipeline segment, we can, if we have good control on the minimum delay, use this delay as a sort of storage. This allows "waves" of unlatched data to proceed through the various pipeline segments, hence the term "wave"

 