Figure 8.43. Relative execution time for LU application.

In Figures 8.40–8.43, we look at the performance of our five protocols (CD-INV, SDD, SCI, CD-UP, DD-UP) on a variety of application problems. All execution times are reported relative to the central directory invalidate protocol (CD-INV).
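The normalization used in the figures can be sketched as follows. This is a minimal illustration, assuming each protocol has a single measured execution time per application; the function name `relative_times` and the numbers in the usage example are placeholders, not values from the figures.

```python
def relative_times(times, baseline="CD-INV"):
    """Divide every protocol's execution time by the baseline's time.

    `times` maps a protocol name to an execution time in any consistent
    unit. The baseline itself always maps to 1.0.
    """
    if baseline not in times:
        raise KeyError(f"baseline protocol {baseline!r} missing from times")
    base = times[baseline]
    if base <= 0:
        raise ValueError("baseline execution time must be positive")
    return {name: t / base for name, t in times.items()}


# Placeholder inputs for illustration only (not measured data).
example = relative_times({"CD-INV": 200.0, "SDD": 220.0, "CD-UP": 150.0})
```

A value below 1.0 means the protocol ran faster than CD-INV on that application; a value above 1.0 means it ran slower.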

The applications represented are:

1. (Figure 8.40) Multifrontal solver (M: 1k × 1k array with 85% line utilization).
2. (Figure 8.41) Partial differential equation (PDE: 32 × 32 array with 50% line utilization).
3. (Figure 8.42) Sparse Cholesky factorization (SPCF: 1138 × 1138 array with 11.2% line utilization).
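The line utilization figures quoted above can be made concrete with a small sketch. It assumes that line utilization means the fraction of words in each touched line that the program actually references; this passage does not define the term, so treat that definition, the 8-byte word size, and the function name `line_utilization` as assumptions. The 64-byte line size matches the line size mentioned for the combining buffer.

```python
LINE_BYTES = 64  # line size, matching the 64 B line mentioned in the text
WORD_BYTES = 8   # assumed word size (not stated in this passage)


def line_utilization(addresses, line_bytes=LINE_BYTES, word_bytes=WORD_BYTES):
    """Fraction of the words in all touched lines that were referenced.

    `addresses` is an iterable of byte addresses accessed by the program.
    """
    addresses = list(addresses)
    words = {a // word_bytes for a in addresses}   # distinct words touched
    lines = {a // line_bytes for a in addresses}   # distinct lines touched
    if not lines:
        return 0.0
    words_per_line = line_bytes // word_bytes
    return len(words) / (len(lines) * words_per_line)


# A stride of two words touches every other word in each line.
half_used = line_utilization(range(0, 128, 16))
```

Under this assumed definition, a dense unit-stride sweep gives a utilization of 1.0, while sparse accesses that touch one word per line give 1/8 with the sizes above.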

The results (Figures 8.40–8.43) are clustered into groups within each figure. The first group shows the relative performance of the invalidate protocols (CD-INV, SDD, SCI). The second group shows the relative performance of the two update protocols (CD-UP, DD-UP) with block synchronization (BS). The next three groups show these two update protocols enhanced by:

1. Use of a one-line (64 B) local combining buffer (BS-C).
2. Use of word synchronization (WS).
3. Use of both (1) and (2) (WS-C).
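The idea behind a one-line combining buffer can be sketched as follows: writes that fall in the same 64-byte line are merged locally, and a single update message is sent for the whole line instead of one per write. This is a simplified model, not the mechanism evaluated in the figures; the class name `CombiningBuffer`, the `send` callback, and the flush policy (flush when a write targets a different line, or on an explicit `flush()` such as at a synchronization point) are all assumptions.

```python
class CombiningBuffer:
    """Hypothetical one-line write-combining buffer for an update protocol."""

    def __init__(self, send, line_bytes=64):
        self.send = send            # callback: send(line_number, {offset: value})
        self.line_bytes = line_bytes
        self.line = None            # line currently held in the buffer
        self.pending = {}           # byte offset within the line -> latest value

    def write(self, addr, value):
        line = addr // self.line_bytes
        # Assumed policy: a write to a different line evicts the buffered line.
        if self.line is not None and line != self.line:
            self.flush()
        self.line = line
        self.pending[addr % self.line_bytes] = value

    def flush(self):
        """Send all buffered writes as one update message, then empty the buffer."""
        if self.pending:
            self.send(self.line, dict(self.pending))
        self.line = None
        self.pending = {}


# Usage: eight writes to line 0 and one write to line 1 produce two messages.
messages = []
buf = CombiningBuffer(lambda line, data: messages.append((line, data)))
for addr in range(0, 64, 8):
    buf.write(addr, addr)
buf.write(64, 1)
buf.flush()
```

Without combining, the nine writes in the usage example would each generate their own update; with the buffer they collapse into one message per line. In this model the saving grows with the number of writes that land in the same line before it is flushed.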

From the figures, we can draw some conclusions. Line utilization determines the efficiency of the update protocols. Low line utilization favors