Page 294



		5.10 Split I- and D-Caches and the Effect of Code Density



		Multiple caches can be incorporated into a single processor design, each cache serving a designated process or use. Over the years, special caches for systems code and user code or even special I/O caches have been considered. The most popular configuration of partitioned caches is the use of separate caches for instructions and data.



		A Note on Split Caches



		Certain older programming environments, especially Fortran, apparently intermixed data and instructions rather frequently, often enough to degrade performance significantly when duplicate lines were not allowed to appear in both the data and instruction caches [259]. Usually, however, the designer can safely adopt a "no duplicate" policy, in which a line may reside in either the instruction cache or the data cache but never in both at the same time. This avoids some implementation complexity. In any event, the instruction cache as well as the data cache should be interrogated on all stores; if an entry is found in the instruction cache, that line should be invalidated to preserve the consistency of the memory system.
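		The store check and the no-duplicate rule can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the names `LRUCache`, `SplitCache`, and `LINE_SIZE` are hypothetical, the caches track line presence only (no data, no dirty bits, no write-back), and the choice to move a line between caches on a fetch or load is one possible way to enforce the no-duplicate rule, not something the text prescribes.

```python
from collections import OrderedDict

LINE_SIZE = 32  # bytes per line; an assumed value for illustration


class LRUCache:
    """Fully associative cache of line addresses with LRU replacement."""

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = OrderedDict()  # line address -> True, oldest first

    def contains(self, line):
        return line in self.lines

    def touch(self, line):
        """Reference a line; return True on a hit, False on a miss (then fill)."""
        if line in self.lines:
            self.lines.move_to_end(line)
            return True
        if len(self.lines) >= self.num_lines:
            self.lines.popitem(last=False)  # evict the least recently used line
        self.lines[line] = True
        return False

    def invalidate(self, line):
        """Drop a line if present; return True if something was removed."""
        return self.lines.pop(line, None) is not None


class SplitCache:
    """Split I/D cache pair enforcing a no-duplicate policy (presence only)."""

    def __init__(self, i_lines, d_lines):
        self.icache = LRUCache(i_lines)
        self.dcache = LRUCache(d_lines)

    def fetch(self, addr):
        line = addr // LINE_SIZE
        self.dcache.invalidate(line)  # assumption: the line moves to the I-cache
        return self.icache.touch(line)

    def load(self, addr):
        line = addr // LINE_SIZE
        self.icache.invalidate(line)  # assumption: the line moves to the D-cache
        return self.dcache.touch(line)

    def store(self, addr):
        line = addr // LINE_SIZE
        self.icache.invalidate(line)  # interrogate the I-cache on every store
        return self.dcache.touch(line)


cache = SplitCache(i_lines=4, d_lines=4)
first_fetch = cache.fetch(0x100)   # cold miss, fills the I-cache
second_fetch = cache.fetch(0x104)  # same line, so a hit
store_hit = cache.store(0x108)     # store to that line invalidates the I-cache copy
refetch = cache.fetch(0x100)       # the instruction line must be fetched again
```

		In this sketch the store to a line held in the I-cache removes it there, so the following fetch of the same line misses. A real design would also have to write back a modified data line before handing it to the I-cache, which the presence-only model ignores.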



		5.10.1 I- and D-Caches



		Separate instruction and data caches offer the designer the possibility of significantly increased cache bandwidth, potentially doubling the access capability of the cache ensemble. Split (I/D) caches have become especially useful in load/store (L/S) microprocessors, whose instruction sets increase I-bandwidth requirements. Split caches come at some expense, however: a unified cache whose size equals the combined size of the split instruction and data caches gives a lower effective miss rate. Figure 5.27 illustrates this. In the unified cache, the ratio of instruction to data working-set elements changes during program execution, and the replacement strategy adapts to it. No such adaptation is possible in the split cache.
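		The adaptation argument can be made concrete with a small simulation. The sketch below is not the measurement behind Figure 5.27; it uses a deliberately contrived, hypothetical trace whose working set shifts from mostly instructions to mostly data, and compares a fully associative LRU unified cache of 8 lines against a fixed 4 + 4 split. The helper names `lru_misses` and `phase` and all sizes are assumptions chosen for illustration.

```python
from collections import OrderedDict


def lru_misses(trace, num_lines):
    """Count misses of a fully associative LRU cache over a list of line IDs."""
    lines = OrderedDict()
    misses = 0
    for line in trace:
        if line in lines:
            lines.move_to_end(line)  # mark as most recently used
        else:
            misses += 1
            if len(lines) >= num_lines:
                lines.popitem(last=False)  # evict the least recently used line
            lines[line] = True
    return misses


def phase(i_count, d_count, repeats, base):
    """Loop over i_count instruction lines and d_count data lines, repeats times."""
    refs = []
    for _ in range(repeats):
        refs += [("I", base + k) for k in range(i_count)]
        refs += [("D", base + k) for k in range(d_count)]
    return refs


# Phase 1: 6 instruction lines and 2 data lines; phase 2: the reverse mix.
trace = phase(6, 2, repeats=50, base=0) + phase(2, 6, repeats=50, base=100)

unified_misses = lru_misses(trace, 8)
split_misses = (
    lru_misses([r for r in trace if r[0] == "I"], 4)
    + lru_misses([r for r in trace if r[0] == "D"], 4)
)

print("unified miss rate:", unified_misses / len(trace))
print("split miss rate:  ", split_misses / len(trace))
```

		For this trace the unified cache takes only the 16 compulsory misses, because each phase's 8-line working set fits regardless of its instruction/data mix. The split pair takes 604 misses in 800 references, since whichever side holds the 6-line loop thrashes in its 4 lines while the other side sits half empty. Real programs are far less extreme, so the gap is usually much smaller than in this worst-case illustration.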



		Split caches offer some implementation advantages. Since the caches need not be split equally, there may be certain environments where a 75-25 or other split may prove more effective. Also, the I-cache is not required to manage a processor store. If a store into the instruction cache is detected, the line is simply invalidated, as this is presumably an unlikely occurrence (i.e., contrary to modern programming practice).
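		As a rough illustration of why an unequal split can pay off, the sketch below sweeps every partition of a fixed budget of 8 lines between an I-cache and a D-cache for a hypothetical loop that touches 6 instruction lines and 2 data lines. The workload, the 8-line budget, and the helper `lru_misses` are assumptions for illustration only; both caches are modeled as fully associative with LRU replacement.

```python
from collections import OrderedDict


def lru_misses(trace, num_lines):
    """Count misses of a fully associative LRU cache over a list of line IDs."""
    lines = OrderedDict()
    misses = 0
    for line in trace:
        if line in lines:
            lines.move_to_end(line)  # mark as most recently used
        else:
            misses += 1
            if len(lines) >= num_lines:
                lines.popitem(last=False)  # evict the least recently used line
            lines[line] = True
    return misses


# Hypothetical workload: a loop over 6 instruction lines and 2 data lines.
i_trace = [k for _ in range(50) for k in range(6)]
d_trace = [k for _ in range(50) for k in range(2)]

TOTAL_LINES = 8  # assumed total capacity to divide between the two caches

# Total misses for each choice of I-cache size (the D-cache gets the rest).
results = {
    i_lines: lru_misses(i_trace, i_lines)
    + lru_misses(d_trace, TOTAL_LINES - i_lines)
    for i_lines in range(1, TOTAL_LINES)
}
best_i_lines = min(results, key=results.get)

for i_lines, misses in results.items():
    print(f"I={i_lines} D={TOTAL_LINES - i_lines}: {misses} misses")
```

		For this workload the best partition is 6 lines for instructions and 2 for data, a 75-25 split, which leaves only the 8 compulsory misses; the equal 4 + 4 split takes 302. The result depends entirely on the assumed working sets, so in practice the partition would be chosen from traces of the target environment.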



		The DTMR for the instruction cache is presented in Figure 5.28, and for the data cache in Figure 5.29. In comparing these figures, note the difference in spatial locality between the instruction cache and the data cache. Larger lines are more effective in small instruction caches than in data caches. The DTMR is for a fully associative LRU cache for our reference R/M architec-
