< previous page page_152 next page >

Page 152
3. Calls are important because of the activity they represent. They cause between 5 and 10 additional instructions to be executed. These instructions, such as load/store, increment, etc., are generally independent of architecture type except where the architecture includes special provision for managing calls.
4. Among the provisions that reduce the number of instructions required to support a call are the load multiple/store multiple instructions and a generalized call instruction that saves/restores a number of registers. Both of these approaches reduce instruction count, but they do not affect the total time it takes to manage the call, since the various load and store activities must still be accomplished.
5. Use of register windows reduces but does not eliminate the cost of the call.
6. One should expect about 60% of total execution time to be spent in the user state and 40% in the system state (including libraries), although individual programs will vary widely from this. For systems that regard library functions as part of the user state, the ratio is 70% user state and 30% system state.
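Point 4 above can be illustrated with a minimal sketch. It assumes a deliberately simple, hypothetical cost model in which every saved register costs one memory transfer, whether the transfers are issued by separate stores or by a single store multiple. The function name and parameters are illustrative only and are not taken from the text.

```python
def save_registers_cost(n_registers, use_store_multiple):
    """Return (instruction_count, memory_transfers) for saving registers.

    Assumed model (hypothetical): one memory transfer per register saved,
    regardless of how many instructions issue those transfers.
    """
    if n_registers == 0:
        # Nothing to save: no instruction and no transfer is needed.
        return 0, 0
    # A store multiple is one instruction; otherwise one store per register.
    instructions = 1 if use_store_multiple else n_registers
    # The memory work is the same in both cases.
    memory_transfers = n_registers
    return instructions, memory_transfers


separate = save_registers_cost(6, use_store_multiple=False)
multiple = save_registers_cost(6, use_store_multiple=True)
print(separate)  # (6, 6)
print(multiple)  # (1, 6)
```

Under this model the instruction count drops from six to one, while the number of memory transfers, and hence roughly the time spent, is unchanged.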
Most modeling aspects of the call as it affects performance are included in Tables 3.4 and 3.5; for example, the numbers of load and store instructions have already been included in these tables.
One must use some discretion before drawing conclusions about the behavior of load multiple and store multiple instructions. The LDM and STM are not used exclusively with call/return, and calls involving only one or no parameters will not use these instructions. Even if we know the distribution of the number of registers involved with the load multiple and store multiple, we do not know the number of registers saved and restored in the call. The load multiple/store multiple is also frequently used by the user in applications programs, for example, to load long operands into the registers or to store double-sized operands back in memory. A study by Rossmann [248], measuring an older move register allocation environment, gives a conservative estimate, for a file of 16 general-purpose registers, of the mean number of operands used by load and store multiples:
Move register allocation
    LDM    5.989 registers
    STM    3.231 registers

As we have seen in Chapter 2, interprocedural register allocators can significantly reduce register traffic. We project that these allocators would both slightly reduce the number of LDM and STM instructions and significantly reduce the mean number of registers moved by LDM and STM. Thus, our projection for the mean number of registers moved under sophisticated register allocation is:
    LDM    3.0 registers
    STM    4.0 registers
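As a rough illustration of how these means might be used, the sketch below estimates the total number of registers moved for a given count of LDM and STM instructions. The instruction counts are hypothetical inputs, the dictionary and function names are illustrative, and the sketch does not model the projected slight reduction in the number of LDM and STM instructions themselves.

```python
# Mean registers moved per instruction, as quoted in the text.
MEAN_REGS = {
    "move": {"LDM": 5.989, "STM": 3.231},
    "sophisticated": {"LDM": 3.0, "STM": 4.0},
}


def registers_moved(ldm_count, stm_count, allocation):
    """Estimate total registers moved by LDM and STM instructions.

    ldm_count and stm_count are hypothetical dynamic instruction counts;
    allocation selects one of the two sets of means above.
    """
    means = MEAN_REGS[allocation]
    return ldm_count * means["LDM"] + stm_count * means["STM"]


# Example with an assumed 100 LDM and 100 STM instructions.
older = registers_moved(100, 100, "move")
projected = registers_moved(100, 100, "sophisticated")
print(round(older, 1), round(projected, 1))
```

Since the values are subject to wide variance, such an estimate is best treated as indicating an order of magnitude rather than a precise prediction.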

As in everything else associated with the call, it should be understood that the preceding values are subject to wide variance. The preceding data [248]

 