
Page 297



		ronments, a designer can safely adopt a no-duplicate policy, which does not allow a line to be present in both caches at the same time. This policy somewhat simplifies implementation.



		Of course, the instruction cache as well as the data cache must be interrogated on all stores, as noted in Figure 5.30. If a data access finds an entry in the instruction cache (or vice versa), that line must be invalidated to preserve the consistency of the memory system.
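The no-duplicate rule can be sketched as a toy model. This is only an illustration under stated assumptions, not the design in Figure 5.30: the class `SplitCache`, its method names, the 32-byte line size, and the choice to allocate a line in the data cache on a store are all hypothetical, and the model tracks only which lines are resident (no capacity limit, no dirty-line write-back).

```python
class SplitCache:
    """Toy split I/D cache that records only which lines are resident.

    Assumptions (not from the text): unlimited capacity, write-allocate
    stores, and no modeling of dirty data or write-back.
    """

    def __init__(self, line_size=32):
        self.line_size = line_size
        self.icache = set()  # line numbers resident in the instruction cache
        self.dcache = set()  # line numbers resident in the data cache

    def _line(self, addr):
        # Map a byte address to its line number.
        return addr // self.line_size

    def fetch(self, addr):
        """Instruction fetch: invalidate any D-cache copy, then load into the I-cache."""
        line = self._line(addr)
        self.dcache.discard(line)
        self.icache.add(line)

    def load(self, addr):
        """Data read: invalidate any I-cache copy, then load into the D-cache."""
        line = self._line(addr)
        self.icache.discard(line)
        self.dcache.add(line)

    def store(self, addr):
        """Data write: interrogate both caches and invalidate an I-cache copy.

        Returns True if an instruction-cache line had to be invalidated.
        """
        line = self._line(addr)
        invalidated = line in self.icache
        self.icache.discard(line)
        self.dcache.add(line)
        return invalidated


cache = SplitCache()
cache.fetch(0x100)                  # line enters the I-cache
hit_in_icache = cache.store(0x104)  # same line: the I-cache copy is invalidated
```

Because every access removes the line from the opposite cache before installing it, the two sets never overlap, which is the invariant the no-duplicate policy relies on.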



		5.10.2 Code Density Effects



		Instruction set architecture can have a significant effect on cache performance. A more densely encoded instruction set captures its program working set (most of the localities used in program execution) in fewer cache lines than a less densely encoded one. This difference in code density directly affects the relative cache miss rates of the two instruction sets. Differences are naturally most dramatic for caches that contain only instructions (I-caches). Caches that contain only data (D-caches) are generally unaffected by instruction encoding and code density, since all instruction sets need basically the same data sets for program execution.
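The effect of code density on the number of lines a working set occupies can be illustrated with a small simulation. This is a minimal sketch, not a model of the prototype architectures: the function names, the fully associative LRU cache, the 32-byte line, and the instruction sizes (3 and 4 bytes) are hypothetical values chosen for illustration, and the "program" is just a straight-line loop executed repeatedly.

```python
from collections import OrderedDict
from math import ceil


def lines_needed(n_instr, bytes_per_instr, line_size=32):
    """Cache lines occupied by a contiguous block of n_instr instructions."""
    return ceil(n_instr * bytes_per_instr / line_size)


def loop_miss_rate(n_instr, bytes_per_instr, cache_bytes, line_size=32, passes=10):
    """Miss rate of a fully associative LRU I-cache running a loop `passes` times.

    Each instruction is charged one reference to the line holding its first byte.
    """
    capacity = cache_bytes // line_size  # number of lines the cache can hold
    cache = OrderedDict()                # keys are line numbers, ordered LRU -> MRU
    misses = refs = 0
    for _ in range(passes):
        for i in range(n_instr):
            line = (i * bytes_per_instr) // line_size
            refs += 1
            if line in cache:
                cache.move_to_end(line)        # mark as most recently used
            else:
                misses += 1
                cache[line] = True
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict the least recently used line
    return misses / refs


# Hypothetical encodings of the same 1000-instruction loop.
dense_lines = lines_needed(1000, 3)    # denser encoding: 3 bytes per instruction
sparse_lines = lines_needed(1000, 4)   # less dense encoding: 4 bytes per instruction

# A 3200-byte cache (100 lines) holds the dense loop but not the sparse one.
dense_rate = loop_miss_rate(1000, 3, cache_bytes=3200)
sparse_rate = loop_miss_rate(1000, 4, cache_bytes=3200)
```

With these assumed sizes the dense loop fits in the cache, so it misses only on the first pass, while the less dense loop exceeds the cache and keeps missing on every pass. The D-cache is left out of the sketch because, as noted above, data references do not depend on instruction encoding.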



		For our three prototype architectures, we compute a code density relative to R/M (the DTMR reference), which can be determined from the data in Chapter 3 (Tables 3.2 and 3.3). For the scientific environment, we have the following relative code densities:



		and



		Mitchell [202] has shown that for small and very large caches, the relative miss rate of I-caches for two different instruction sets is directly related to the relative code densities:



		The preceding relationship does not hold for intermediate-size caches. For certain cache sizes, a spike in relative performance occurs when the more dense architecture begins to capture its instruction working set while the less dense architecture has not yet done so. Figure 5.31 illustrates this phenomenon for a particular program and several architectures of different code density. The spike produces a relative performance difference of about 3.0 when, at around 16 KB, the reference instruction set captures its working set. Spikes as high as 20.0 have been noted for programs with well-defined working sets. All architectures profit from increasing cache
