executed corresponds to the dynamic count of high-level language (HLL) operations that are expected to be executed during the course of a program. Architectures can then be compared relative to one hundred HLL operations, and distributions of instruction types and instruction profiles can be developed.
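
The normalization to one hundred HLL operations can be illustrated with a minimal sketch. The function name, the instruction categories, and all counts below are hypothetical and chosen only for illustration; they are not measurements from the text.

```python
def per_hundred_hll(instruction_counts, hll_operations):
    """Scale dynamic instruction counts to a basis of 100 HLL operations.

    instruction_counts: dict mapping an instruction type to its dynamic count.
    hll_operations: dynamic count of HLL operations for the same program run.
    """
    if hll_operations <= 0:
        raise ValueError("hll_operations must be positive")
    return {
        kind: count * 100.0 / hll_operations
        for kind, count in instruction_counts.items()
    }


# Hypothetical dynamic counts for one architecture (illustrative only).
counts = {"move": 5000, "alu": 3000, "branch": 2000}
profile = per_hundred_hll(counts, hll_operations=4000)
total_per_hundred = sum(profile.values())
```

Profiles computed this way for two architectures share the same basis, so their instruction-type distributions can be compared directly.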
An instruction set affects the performance of any processor implementation by determining the amount of dynamic activity that the processor must complete in order to execute a program. A richer instruction set with a large set of formats decreases the number of instructions that must be executed for a particular program. More complete encoding of individual instructions (using register modes, etc.) reduces the size of individual instructions and may secondarily affect the dynamic instruction count. A reduced dynamic instruction count, together with smaller instructions, reduces the instruction bandwidth that a processor requires from memory to execute a program. All of these contribute positively to processor performance. Offsetting these benefits are the increased number of cycles for instruction execution and the increased decoder complexity required to decode multiple format types and highly encoded portions of the instruction. Such decoder complexity may in fact affect the processor cycle time.
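
The tradeoff can be made concrete with a first-order sketch. It assumes that instruction traffic is simply the dynamic instruction count times the average instruction size, and that execution time is the dynamic count times cycles per instruction times cycle time. The class name and every number below are hypothetical, chosen only to show how the terms interact.

```python
from dataclasses import dataclass


@dataclass
class IsaProfile:
    """First-order description of one instruction set running one program."""
    dynamic_instructions: int      # instructions executed
    avg_instruction_bytes: float   # average encoded instruction size
    cycles_per_instruction: float  # average cycles to execute one instruction
    cycle_time_ns: float           # processor cycle time

    def instruction_traffic_bytes(self) -> float:
        # Bytes of instructions that must be fetched from memory.
        return self.dynamic_instructions * self.avg_instruction_bytes

    def execution_time_ns(self) -> float:
        # Ignores memory stalls and other second-order effects.
        return (self.dynamic_instructions
                * self.cycles_per_instruction
                * self.cycle_time_ns)


# Hypothetical values: a highly encoded set versus a loosely encoded one.
encoded = IsaProfile(1_000_000, 3.0, 2.0, 10.0)
simple = IsaProfile(1_400_000, 4.0, 1.2, 9.0)

for name, isa in (("encoded", encoded), ("simple", simple)):
    print(name, isa.instruction_traffic_bytes(), isa.execution_time_ns())
```

With these assumed values the highly encoded set fetches fewer instruction bytes, yet its extra cycles per instruction and longer cycle time give it the longer execution time. Different assumed values can reverse the outcome, which is the point of the tradeoff.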
With advances in register allocation technology, general-purpose register set architectures have become the dominant style of instruction set. Registers provide faster access than cache or main memory, and they reduce the data traffic to both cache (where present) and memory. While even a few registers provide significant value to the processor, increasing the size of the register set beyond a certain point provides only a marginal increase in performance. In fact, in certain situations, large register sets may actually decrease performance. This could arise either because register set access increases the cycle time or because frequently occurring interrupts require register contents to be saved to memory and later restored from it.
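
The interrupt cost can be estimated with a minimal sketch. It assumes, pessimistically, that every register is saved and later restored on each interrupt and that each transfer costs a fixed number of cycles. The function name and all parameter values are hypothetical.

```python
def save_restore_cycles(num_registers, interrupts, cycles_per_transfer=1):
    """Cycles spent saving and restoring the register set across interrupts.

    Assumes every register is written to memory once and read back once
    per interrupt (hence the factor of 2).
    """
    if num_registers < 0 or interrupts < 0 or cycles_per_transfer < 0:
        raise ValueError("arguments must be non-negative")
    return 2 * num_registers * cycles_per_transfer * interrupts


# Hypothetical workload: 1000 interrupts, one cycle per register transfer.
small_set = save_restore_cycles(num_registers=32, interrupts=1000)
large_set = save_restore_cycles(num_registers=128, interrupts=1000)
```

Under these assumptions the overhead grows linearly with the size of the register set, so quadrupling the number of registers quadruples the cycles lost to each interrupt.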
2.8 Some Areas for Further Research
The concepts of wave pipelining have only recently been applied to microprocessors and on-chip technology as a method of reducing cycle time. A good deal of work remains to be done, especially in developing circuits with stable and predictable delay that better support wave pipelining. Predictable delay is especially a problem with CMOS technology, and the applicability and use of wave pipelining in large implementations remain a challenge.
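
The importance of predictable delay can be seen in a minimal sketch of the commonly used first-order bound on the wave-pipelined clock period: the difference between the longest and shortest path delays through the logic, plus a clocking overhead. The function and parameter names are hypothetical, and the delay values are illustrative rather than measured.

```python
def wave_min_cycle(p_max, p_min, clock_overhead):
    """First-order lower bound on the clock period of a wave pipeline.

    p_max: longest path delay through the logic
    p_min: shortest path delay through the logic
    clock_overhead: setup time, clock skew, and similar fixed costs
    All values use the same time unit.
    """
    if p_min > p_max:
        raise ValueError("p_min cannot exceed p_max")
    return (p_max - p_min) + clock_overhead


# Illustrative delays in nanoseconds.
well_balanced = wave_min_cycle(p_max=20.0, p_min=15.0, clock_overhead=2.0)
poorly_balanced = wave_min_cycle(p_max=20.0, p_min=10.0, clock_overhead=2.0)
```

In this model the achievable cycle time depends on the spread of path delays rather than on the longest delay alone, which is why circuits with unpredictable delay limit the benefit of the technique.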
The categorization of area on a functional (rbe, or register bit equivalent) rather than a geometric basis currently exists only for storage elements. Units such as decoders and ALUs have received little attention in area modeling, yet performance optimization is impossible without similar functional area models for them. Indeed, an understanding of minimum cycle time and functional modeling of area are basic ingredients that must be present in any formalized model of microprocessor optimization.
With the abundant literature on instruction set analysis, it might seem that this is a time-worn area that is well understood; but when the field can move from the breakthrough of a highly encoded Intel iAPX 432 in 1981 to a breakthrough of a very loosely encoded RISC-1 in 1985, it is clear that