< previous page page_129 next page >

Page 129
Multiple register sets can be organized as register windows with a degree of overlap between register sets, an analog to the run-time stack organization. The overlap allows space for arguments to be passed from one routine to another, and the multiplicity of sets avoids the necessity of having to save and restore registers upon entering or exiting a routine (Figure 2.50 a). The net effect of multiple register sets (assuming sufficient size) is twofold:
1. Reduction of data memory traffic.
2. Depending on the architecture, it may reduce the required instruction traffic by eliminating load and store instructions associated with item 1.
Of course, multiple register sets (MRS) have a limited capacity; they consist of a fixed number of register sets. MRS can be parameterized by the number of registers privately available (the locals), by the number of register words shared among procedures (the overlap), and finally by the number of windows in the multiple register set. Figure 2.50(b) shows the arrangement used in several SPARC implementations; each window has eight local registers. The processor addresses 32 registers at any one time, eight of which are reserved for global variables and eight of which are shared, each with input and output procedures. Thus, 16 new registers are added for each additional window in the MRS ensemble. A SPARC-type window can be designated as (8,8, n) with notation (number of local registers, number of overlap registers, and n, the number of windows present in a particular implementation). In a particular program, a sequence of calls may exceed the number of windows implemented. MRSs are usually arranged as a circular buffer so that when the set size is exceeded, the entry furthest back is stored in memory, allowing the called routine proper register space (overflow). After a routine has overflowed, when the program sequence later returns to a level that is not in the MRS, an underflow occurs,causing the register set most removed to be saved in memory, and the underflowed register space to be restored for the current routine.
Data buffers can be arranged as an even closer mapping of the run-time stack. Such buffers can be organized in at least two ways:
1. As a transparent buffer for the run-time stack (stack cache buffer [73]).
2. As a programmer-addressed buffer that is used in conjunction with the run-time stack in memory (contour buffer [92]).
As only the MRS approach is in common use and all these techniques perform roughly the same function, we discuss only this approach here.
All data buffers profit from register allocation. In evaluating the effectiveness of various kinds of data buffers, a fundamental difference is the kind of register allocation scheme used. Simple allocation, called one-one allocation, assigns commonly used variables to the available registers, but does not attempt to reuse those registers after they are no longer needed in a procedure. More sophisticated allocators analyze the flow of the data throughout the procedure to determine when a particular register is available for reuse; that register is then reassigned. These allocators are called many-one allocators or global allocators, since they act globally over a single procedure. Even more advanced allocators can do interprocedural allocation, that is, allocate registers on a many-one basis across procedures

 
< previous page page_129 next page >