page_156




		Threads are special sub-processes that share the address space of a process. They promote the parallel execution of various operating system functions and allow the system to support finer-grain parallel processing. Using threads to parallelize the operating system, however, creates many small processes, and these processes must communicate via some type of context switch. In modern systems with large register sets, the additional cost of the context switch defeats the operating system designer's attempts to improve performance through parallel execution of the operating system.



		3.4 Breaks in Pipelined and Overlapped Machine Execution



		Certain aspects of program behavior disrupt the execution of an overlapped or pipelined host. Breaks or delays in execution occur, and instruction processing must be suspended until these events are resolved. These delays are usually caused by one of the following phenomena:



		1. Branching and similar control delays.



		2. Data dependencies.



		3. Run-on instructions.



		4. Memory hierarchy delays.



		Branching may cause significant delays because the target instruction must usually be fetched before execution can continue; the successor instruction is thus delayed by the fetch time of the target instruction. Many operations depend on a preceding operation for completion, and these data dependencies add further delay to program execution. Some instructions (e.g., floating-point divide) take many more cycles than the bulk of instructions executed. These run-on instructions, by their nature, require multiple EX cycles (or accesses to memory, etc.) before a result is available. This delay frequently cannot be masked by the execution of other instructions, and so adds to the delay in program execution.



		Figure 3.3 represents a possible allocation of delay for program execution. We model pipeline performance as the sum of the decode time (that is, the minimum time allocated for the execution of any instruction, independent of dependencies or resource conflicts) plus the accumulated delays from these various sources. The sources of delay are assumed to be independent of one another; that is, a delay from one source cannot be masked by a simultaneous delay from another source. This is generally true, but may be only approximate in certain cases. Because it assumes a linear, independent composition of delays, the model provides a conservative estimate of performance. To evaluate the effects of the various delays, we must first determine the frequency of the program events that cause them [170, 256].
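


		The additive model described above can be illustrated with a minimal sketch. It assumes that each delay source contributes its event frequency (events per instruction) multiplied by a penalty (cycles per event), and that the contributions simply add to the base decode time. The function name, the dictionary layout, and all numeric values are hypothetical and chosen only for illustration; they are not taken from Figure 3.3 or from the cited measurements.

```python
def cycles_per_instruction(base_cycles, delay_sources):
    """Additive delay model: base time plus the sum of independent delays.

    base_cycles   -- minimum cycles allocated to any instruction (decode time)
    delay_sources -- maps a source name to (frequency, penalty), where
                     frequency is events per instruction and penalty is
                     cycles lost per event (both are assumed inputs)
    """
    # Sources are treated as independent, so no delay masks another
    # and the contributions are simply summed.
    total_delay = sum(freq * penalty for freq, penalty in delay_sources.values())
    return base_cycles + total_delay


# Hypothetical frequencies and penalties for the four delay sources.
delays = {
    "branch": (0.25, 2.0),
    "data_dependency": (0.5, 1.0),
    "run_on": (0.0625, 8.0),
    "memory_hierarchy": (0.125, 4.0),
}

cpi = cycles_per_instruction(1.0, delays)
print(cpi)
```

		Because simultaneous delays are never allowed to overlap in this sketch, the result is an upper bound on the cycles per instruction under the stated inputs, which is the sense in which the model is conservative.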



		3.4.1 Instruction Run Length



		The number of instructions between taken branches defines the instruction run length. This run length is an important parameter in selecting a pipeline organization and an instruction buffer strategy. Figure 3.4 shows that the
