
Page 81
For registers using transparent clocking and with no hold time, we can use
C = tg + max(td) - min(td),
where (as before) tg = setup time, and max(td) and min(td) are the maximum and minimum output delays after the clock is enabled. Again, this assumes both that the uncontrolled clock skew and the hold time are zero. The usual case for transparent latches is that the hold time is zero. For a more complete analysis of C for various latches and timing assumptions, see Klass [165] and Wong [309].
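As a rough numerical illustration of the expression above, the following sketch computes C from a setup time and a set of output delays. The function name `clocking_overhead` and the sample values are hypothetical and chosen only for illustration. As in the text, zero hold time and zero uncontrolled clock skew are assumed.

```python
def clocking_overhead(setup_ns, output_delays_ns):
    """C = tg + max(td) - min(td).

    Assumes zero hold time and zero uncontrolled clock skew, as in the text.
    """
    return setup_ns + max(output_delays_ns) - min(output_delays_ns)


# Hypothetical values (not from the text): 1 ns setup time,
# output delays ranging from 1 ns to 2 ns.
print(clocking_overhead(1.0, [1.0, 1.5, 2.0]))  # 2.0
```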
"Wave" pipelining does not shorten the longest path or the total instruction execution delay, Σ(Pmax + C)i. It simply increases the rate at which data is processed. Since the Pmax + C delay is unchanged while the clock period is reduced, the time at which the clock occurs must be adjusted.
Suppose a particular pipeline segment has Pmax = 10 ns and Pmin = 7 ns, with C = 2 ns. Rather than waiting for Pmax + C = 10 + 2 = 12 ns, we can be assured that we can bring in another pair of operands (another wave of data) 7 ns before the conclusion of the first 12 ns period. This may be done without interfering with the clocking of the first pair of data into a register. The second data will simply not arrive until the first data has been sampled properly into a storage element. With respect to this single pipeline segment, we can reduce the cycle time to Δt = Pmax - Pmin + C = 10 - 7 + 2 = 5 ns. Clearly, as we continue the analysis across multiple pipeline segments, the worst segment (the segment with the longest cycle time) must be the one that is used for the system. Data can only be accepted and produced at the rate commensurate with the slowest segment in the pipeline. To use storage in the form of minimum delay as a factor in design to improve cycle time requires careful management of the time at which the clock occurs to sample data into registers. No longer will clock signals be arranged to arrive at exactly the same time throughout all storage elements in this system. While normally clock signals are expected to arrive at all stages at the same time (without clock skew), constructive clock skew plays an important part in allowing the achievement of minimum cycle time.
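The arithmetic in this example can be checked with a short sketch. The function names are hypothetical, and the second segment in the multi-segment call is an invented value used only to show that the slowest segment sets the system cycle time.

```python
def wave_cycle_time(p_max, p_min, c):
    """Per-segment wave-pipelined cycle time: Pmax - Pmin + C."""
    return p_max - p_min + c


def system_cycle_time(segments):
    """The system must run at the longest per-segment cycle time."""
    return max(wave_cycle_time(p_max, p_min, c) for p_max, p_min, c in segments)


# Segment from the text: Pmax = 10 ns, Pmin = 7 ns, C = 2 ns.
print(10 + 2)                      # conventional cycle, Pmax + C: 12
print(wave_cycle_time(10, 7, 2))   # wave-pipelined cycle: 5

# Hypothetical second segment (8, 6, 2) has a 4 ns cycle,
# so the 5 ns segment still limits the system.
print(system_cycle_time([(10, 7, 2), (8, 6, 2)]))  # 5
```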
Suppose we have a synchronously clocked machine whose state is sampled by the rising clock signal. We use CK↑ to indicate the initial relative time (t = 0) for clock action at the beginning of the pipeline. Then CSi↑ indicates the amount of relative time change (skew) that the ith pipeline stage has with respect to CK↑. The clock for the ith stage must be skewed by CSi↑:
[Equation image: 0081-01.gif]
where (Pmax + C)j is the sum of Pmaxj and clocking delay C. This represents the total maximum delay through the jth segment of the pipeline. Skewing the clock by CSi↑ ensures that the clock samples the data at a time equal to the maximum delay required to reach this ith segment point. This is shown as follows:
[Figure image: 0081-02.gif]
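A minimal sketch of the skew calculation follows. It assumes, from the description above, that the skew for stage i is the accumulated maximum delay (Pmax + C)j of segments 1 through i. The summation bounds, and the reduction modulo the cycle time used to express each skew as a phase offset within one clock period, are assumptions made for this sketch and are not taken from the text's equation. The function name and the three-stage pipeline are hypothetical.

```python
from itertools import accumulate


def clock_skews(segments, cycle_time):
    """Return (accumulated_delay, phase_offset) for each pipeline stage.

    segments: list of (p_max, c) pairs, one per stage.
    accumulated_delay: sum of (Pmax + C) over stages 1..i (assumed bounds).
    phase_offset: accumulated_delay modulo cycle_time (assumed reduction).
    """
    totals = accumulate(p_max + c for p_max, c in segments)
    return [(total, total % cycle_time) for total in totals]


# Hypothetical pipeline: three identical stages with Pmax = 10 ns, C = 2 ns,
# clocked at the 5 ns wave-pipelined cycle time from the earlier example.
for stage, (total, phase) in enumerate(clock_skews([(10, 2)] * 3, 5), start=1):
    print(stage, total, phase)
```

With these values the accumulated delays are 12, 24, and 36 ns, so under the stated assumptions the stage clocks would be offset by 2, 4, and 1 ns relative to CK↑.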

 