Test Results

Three types of tests were used for integration methods based on the transformations described in Section 2. The first and third sets of tests used covariance matrices with a simple structure, where a different method could be used to compute $F$ and independently verify the accuracy of the algorithms. The second set of test used random covariance matrices, where no simple independent method of verifying the accuracy of the algorithms is available. In all of the tests the lower limits ${\bf a}$ were set equal to $-\infty$. For the first two test sets the requested absolute accuracy level $\epsilon$ was fixed, and for the third test set three different $\epsilon$ values were used.

A FORTRAN implementation of the algorithm in the previous section was compared with the Schervish algorithm and, an algorithm which uses a subregion adaptive multiple integration method for the numerical integration. This method is described in an article by Berntsen, Espelid and Genz [BEG 91]. The single precision FORTRAN implementation that was used for the tests reported in this section is called SCUHRE. This algorithm begins by applying a low degree (basic) integration rule to the whole integration region (a unit cube for these problems). If the error estimate is too large, the initial integration region is divided in half. The cut is made along the coordinate axis where the integrand is most rapidly changing. Next, the basic rule is applied to the two new subregions. If the total error estimate is still too large, the subregion with largest error is cut in half. The algorithm continues in this manner by subdividing, at each stage, the subregion with largest estimated error, until the sum of all of the subregion errors is less than $\epsilon$ or a limit on the total number of integrand evaluations has been reached. This type of algorithm concentrates the integrand evaluations in the subregions and along the directions where the integrand is most rapidly changing.

Initial tests with the SCUHRE code showed that although the degree seven basic rule used by SCUHRE often provided very accurate results, it required too much time for larger $m$ values. A degree five basic rule was therefore added to the code and used as the basic integration rule for the following tests. Initial tests with the modified SCUHRE code also showed that the error estimates provided by SCUHRE were usually much too conservative. A driver subroutine was therefore written for SCUHRE. The driver initially called SCUHRE with $N_0 = 100+10m^2$ allowed integrand evaluations. Repeated calls of SCUHRE were made, with $N_k = 2N_{k-1}$ new integrand evaluations allowed for the $k^{th}$ call. For each new call of SCUHRE, the computation began where the previous computation had been stopped. If $A_k$ and $E_{A_k}$ were used to denote the respective estimates for $F$ and the error in $F$ provided by SCUHRE, the driver subroutine continued the computation until $\vert A_k - A_{k-1}\vert + E_{A_k}/\sum_{j=0}^kN_j < \epsilon$. The results given in Tables 1-3 were obtained using this modified version of SCUHRE (with driver).

A driver was also written for the crude Monte-Carlo algorithm described in the previous section and this driver was used for the tests reported in this section. For this driver $N_0 = 100+10m^2$ was used for the initial $N_{max}$ in the Monte-Carlo algorithm. Subsequent calls to the Monte-Carlo algorithm used $N_{max} = N_k = 3N_{k-1}/2$. Let $M_k$ denote the Monte-Carlo estimate for $F$ using $N_k$ samples and $E_{M_k}^2 = \sigma_k^2/N_k$, where $\sigma_k^2$ is the sample variance for $M_k$. This driver subroutine continues the computation until $2.5/(\sum_{j=0}^k 1/E_{M_j}^2)^{\frac{1}{2}} < \epsilon $ and, for an estimate for $F$ it returns the weighted average $(\sum_{j=0}^kM_j/E_{M_j}^2)/(\sum_{j=0}^k1/E_{M_j}^2)$.

For the test results in this section, S, M and A are used to denote results from the Schervish algorithm, Monte-Carlo algorithm (with driver subroutine) and subregion adaptive algorithm (modified SCUHRE with driver subroutine); $E_S$, $E_M$ and $E_A$ are used to denote average absolute errors and $T_S$, $T_M$ and $T_A$ are used to denote average running times (in seconds) for the three algorithms. A Digital DECstation 3100 ($\simeq$ 14 mips) was used for the tests.

In the first set of tests, a constant correlation family of covariance matrices, defined by $\sigma_{i,j} = \rho$, for $i \neq j$ and $\sigma_{i,i} = 1$, was used. With these covariance matrices, $F$ has a value that can be accurately computed independently ([TONG 90], p. 192). This is given by

\begin{displaymath}
F({\bf b}) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}
e...
...2}
\prod_{i=1}^m \Phi((b_i + \sqrt{\rho}t)/\sqrt{1-\rho})  dt
\end{displaymath}

For each $m = 3, 4, ..., 10, 15, 20$, a sample of 50 different covariance matrices was used, with random $\rho$ values, chosen uniformly from (0,1). The upper limits were also random, chosen uniformly from [0,$\sqrt{m}$]. For $m > 6$, the Schervish software required rapidly increasing computation times and so results are not given for these cases. For $m = 5$ and $m = 6$, the time distributions were skewed, with approximately 80% of the particular computation times less than $T_S$ and one or two more five times larger than $T_S$. The results are reported in Table 1. Under each row in the table that is labeled with an $m$ value is a row of standard deviation values for the corresponding entries in row above.



Table 1: Constant $\sigma_{i,j}$ Covariance Matrix Results, $\epsilon = 0.005$
m $E_S$ $E_M$ $E_A$ $T_S$ $T_M$ $T_A$ $T_S$/$T_M$
3 0.00035 0.00108 0.00002 0.01 0.11 0.08 0.06
  0.00022 0.00095 0.00003 0.02 0.12 0.05  
4 0.00025 0.00116 0.00007 0.23 0.18 0.17 1.27
  0.00021 0.00121 0.00007 0.31 0.16 0.08  
5 0.00025 0.00142 0.00012 10.91 0.36 0.29 30.50
  0.00016 0.00096 0.00011 22.25 0.22 0.09  
6 0.00018 0.00118 0.00016 334.20 0.80 0.56 419.98
  0.00013 0.00102 0.00015 481.89 0.93 0.42  
7 - 0.00095 0.00018 - 1.13 0.93  
  - 0.00074 0.00019 - 0.83 0.11  
8 - 0.00104 0.00020 - 1.44 1.47  
  - 0.00088 0.00021 - 0.94 0.16  
9 - 0.00118 0.00021 - 1.59 2.08  
  - 0.00090 0.00021 - 0.82 0.19  
10 - 0.00118 0.00022 - 2.18 3.03  
  - 0.00105 0.00027 - 1.19 0.69  
15 - 0.00103 0.00032 - 6.13 10.03  
  - 0.00094 0.00032 - 2.81 0.90  
20 - 0.00081 0.00044 - 8.26 19.18  
  - 0.00065 0.00042 - 2.50 4.41  



A second set of tests used 50 random covariance matrices. These random covariance matrices were generated with a method described in an article by Marsaglia and Olkin [MO 94], for $m = 3, 4, ..., 10$. This method consists of the following steps. First, a random lower triangular matrix is generated with entries chosen uniformly from [-1,1]. Then, the rows of this matrix are normalized so that the sum of the squares of the elements in each row add to one. Finally this lower triangular matrix is multiplied by it's transpose, to produce a random covariance matrix. The upper integration limits for these tests were chosen in the same way that they were chosen for the previous tests. For $m = 3$ and $m = 4$ the Schervish software [SCHRV 84] was used for comparison. For $m > 4$, however, the Schervish software again required widely varying times for each test and would not terminate after 50 hours for one of the tests with $m = 5$ so it could not be used for $m > 4$. Since accurate values for $F$ were not known, the values used were obtained using the subregion adaptive software with requested absolute accuracy $\epsilon = 0.00025$. In Table 2 the results for these tests are reported.



Table 2: Random Covariance Matrix Results, $\epsilon = 0.005$
m $E_S$ $E_M$ $E_A$ $T_S$ $T_M$ $T_A$ $T_S$/$T_M$
3 0.00028 0.00166 0.00006 0.02 0.48 0.12 0.03
  0.00053 0.00146 0.00007 0.03 0.52 0.10  
4 0.00109 0.00164 0.00055 574.60 1.09 0.48 527.64
  0.00377 0.00152 0.00086 3889.45 1.02 0.36  
5 - 0.00146 0.00130 - 1.61 1.00  
  - 0.00127 0.00166 - 1.36 0.91  
6 - 0.00166 0.00174 - 0.81 0.87  
  - 0.00134 0.00186 - 0.80 0.91  
7 - 0.00194 0.00248 - 1.05 1.22  
  - 0.00121 0.00199 - 1.04 1.32  
8 - 0.00199 0.00269 - 0.80 1.48  
  - 0.00173 0.00225 - 0.58 1.32  
9 - 0.00188 0.00246 - 0.79 2.04  
  - 0.00146 0.00213 - 0.62 2.67  
10 - 0.00169 0.00239 - 1.35 2.25  
  - 0.00136 0.00212 - 1.08 1.78  



In order to investigate the effect of the size of the requested accuracy on the computation times for the the three algorithms, a third set of tests were performed. These tests used the same sample size and matrices as the first set of tests, but also used a decreasing sequence of three $\epsilon$ values ( $\epsilon = 0.01, 0.001, 0.0001$), for $m = 3, 4, 5, 6$. Results obtained without excessive workstation time are given in Table 3.



Table 3: Constant $\sigma_{i,j}$ Covariance Matrix Results, Decreasing $\epsilon$
m $\epsilon$ $E_S$ $E_M$ $E_A$ $T_S$ $T_M$ $T_A$
3 0.01 0.000515 0.00160 0.000025 0.006 0.03 0.06
  0.001 0.000062 0.00029 0.000023 0.007 1.22 0.07
  0.0001 0.000004 - 0.000004 0.009 - 0.21
4 0.01 0.000403 0.00170 0.000071 0.112 0.05 0.09
  0.001 0.000040 0.00032 0.000048 0.174 2.01 0.19
  0.0001 0.000004 - 0.000015 0.382 - 0.77
5 0.01 0.000445 0.00166 0.000118 3.978 0.07 0.14
  0.001 0.000040 0.00029 0.000089 9.255 3.87 0.28
  0.0001 0.000005 - 0.000020 26.82 - 1.68
6 0.01 - 0.00155 0.000166 - 0.13 0.23
  0.001 - 0.00031 0.000117 - 6.09 0.50
  0.0001 - - 0.000037 - - 2.92



For all of the tests the computation times for the transformation based algorithms increased slowly with $m$. For the smaller $m$ values, the subregion adaptive algorithm took less time than the Monte-Carlo algorithm and was more accurate, but for $m$ greater than 6 or 7 the Monte-Carlo algorithm usually took less time and had comparable accuracy. For the $m = 3$ the Schervish algorithm was faster than the other two algorithms, but these tests show that for $m$ greater than 4 or 5, it might not be feasible to use the Schervish algorithm for practical computations. For the random covariance matrix tests, the results suggest that either the subregion adaptive or Monte-Carlo algorithm can reliably produce multivariate normal probabilities accurate to two or three decimal digits in one or two seconds of DECstation 3100 time, for problems with as many as ten dimensions. Both the Schervish and the Monte-Carlo algorithms require rapidly increasing amounts of computation time when $\epsilon$ is decreased.




2004-11-30