Tests

The tests reported in this section compare FORTRAN implementations of six methods: Schervish's method (MULNOR), acceptance-rejection sampling (REJNRM) Deák's method (with n = 3, SPHNRM) and Genz's Monte-Carlo (RANNRM), adaptive (SADNRM) and lattice rule (KRONRM) methods. Randomized Korobov rules (see Keast, 1973 and Cranley and Patterson, 1976) were used for KRONRM. All implementations require input parameters: m, $\Sigma$, ${\bf b}$ and $\epsilon$, a requested absolute accuracy. The estimated absolute accuracy was computed using 3e where e is the standard error for the sample, for the implementations of the methods where random sampling was used (SPHNRM, RANNRM and KRONRM).

Each test run applied selected methods to fifty randomly generated test problems. Most of the tests used a constant correlation $\Sigma$with elements $\sigma_{i,j} = \sigma$, for $i \neq j, \sigma_{i,i} = 1$. In this case, $P({\bf b})$ can be independently and accurately computed (Tong, 1990) using

\begin{displaymath}P({\bf b}) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{...
...}
\prod_{i=1}^m \Phi((b_i + \sqrt{\rho}t)/\sqrt{1-\rho}) \ dt.
\end{displaymath}

For each individual test, a random $\sigma$ was generated uniformly from [0,1] and m random bi's were generated uniformly from $[0,\sqrt(m)]$. For each method tested, average absolute errors and average times were computed with averages taken over fifty samples. Respective standard deviations were also computed. In all tests the average errors and their standard deviations were scaled by $\epsilon^{-1}$, and the average times are in seconds. A Personal DECstation 5000/25 was used for all of the computations, which were carried out in single precision.

The first test runs used $\epsilon = 0.01$ and the results for selected methods are given in Tables 1 and 2. The results for each m in these and subsequent tables are presented in two rows. The first row contains actual scaled average errors and average times, and second row contains respective standard deviations for these quantities. Some of the table entries for average time are zero; these occur when the average was less than 0.005 seconds. Similarly, zero average scaled errors indicate actual values less than 0.005.




Table 1: Constant $\sigma_{i,j}$ results, $\epsilon = 0.01$
  REJNRM SPHNRM MULNOR
m E T(s) E T(s) E T(s)
3 0.24 0.55 0.25 0.11 0.06 0.00
sd's 0.17 0.24 0.18 0.08 0.04 0.00
4 0.24 0.62 0.29 0.07 0.05 0.08
sd's 0.18 0.31 0.18 0.05 0.03 0.12
5 0.23 0.77 0.23 0.11 0.04 1.47
sd's 0.19 0.37 0.17 0.04 0.03 2.07
6 0.20 0.95 0.16 0.17 0.04 118.60
sd's 0.14 0.46 0.11 0.06 0.03 226.46
7 0.29 0.79 0.13 0.37 - -
sd's 0.21 0.42 0.10 0.09 - -
8 0.23 0.88 0.11 0.52 - -
sd's 0.20 0.49 0.08 0.07 - -
E = scaled average error and T = average time

MULNOR was not used for m > 6 because of its rapidly increasing computation times. The REJNRM average times for m > 8gradually increased to approximately two seconds at m = 14. The SPHNRM average times for m > 8gradually increased to approximately six seconds at m = 14. All three methods appear to be robust and reliable at this level of requested accuracy.




Table 2: Constant $\sigma_{i,j}$ results, $\epsilon = 0.01$
  RANNRM SADNRM KRONRM
m E T E T E T
3 0.13 0.01 0.00 0.00 0.00 0.08
sd's 0.16 0.00 0.00 0.01 0.00 0.00
4 0.16 0.01 0.00 0.01 0.00 0.12
sd's 0.19 0.01 0.01 0.00 0.00 0.01
5 0.15 0.02 0.01 0.01 0.01 0.16
sd's 0.12 0.02 0.01 0.00 0.01 0.01
6 0.12 0.02 0.01 0.02 0.01 0.21
sd's 0.13 0.02 0.02 0.00 0.01 0.05
7 0.14 0.03 0.00 0.03 0.00 0.25
sd's 0.13 0.02 0.01 0.00 0.00 0.02
8 0.13 0.03 0.01 0.06 0.05 0.29
sd's 0.12 0.02 0.03 0.01 0.08 0.03
9 0.12 0.04 0.00 0.10 0.00 0.34
sd's 0.11 0.02 0.01 0.01 0.01 0.04
10 0.13 0.05 0.00 0.18 0.00 0.38
sd's 0.13 0.02 0.01 0.01 0.00 0.03
11 0.15 0.06 0.01 0.35 0.00 0.43
sd's 0.14 0.05 0.01 0.04 0.00 0.02
12 0.10 0.07 0.01 0.72 0.01 0.49
sd's 0.11 0.04 0.01 0.06 0.02 0.05
13 0.15 0.08 0.02 0.52 0.00 0.53
sd's 0.15 0.06 0.02 0.71 0.01 0.04
14 0.14 0.09 0.03 0.89 0.06 0.60
sd's 0.13 0.06 0.02 1.29 0.06 0.09
E = scaled average error and T = average time




All three of the methods RANNRM, SADNRM and KRONRM are robust and reliable, and usually faster and more accurate (except for m = 3 and m = 4 ) than the other three methods. There is not much variation in time taken by a particular method until m reaches ten or so.

The next tests used $\epsilon = 0.001$ (Tables 3 and 4).




Table 3: Constant $\sigma_{i,j}$, $\epsilon = 0.001$
  SPHNRM MULNOR
m E T E T
3 0.19 10.50 0.05 0.00
sd's 0.16 6.48 0.03 0.00
4 0.27 7.84 0.04 0.13
sd's 0.20 3.86 0.03 0.18
5 0.22 9.93 0.03 8.08
sd's 0.13 5.25 0.02 15.52
6 0.28 6.96 - -
sd's 0.19 4.39 - -
7 0.22 14.58 - -
sd's 0.16 7.92 - -
8 0.25 14.11 - -
sd's 0.19 6.81 - -



Table 4 Constant $\sigma_{i,j}$ results, $\epsilon = 0.001$
  RANNRM SADNRM KRONRM
m E T E T E T
3 0.22 0.24 0.02 0.00 0.01 0.10
sd's 0.19 0.52 0.02 0.00 0.01 0.04
4 0.20 0.85 0.04 0.01 0.03 0.14
sd's 0.14 1.18 0.05 0.03 0.05 0.06
5 0.20 0.96 0.03 0.01 0.09 0.17
sd's 0.18 2.30 0.05 0.00 0.13 0.05
6 0.20 0.76 0.04 0.02 0.04 0.21
sd's 0.17 1.36 0.04 0.01 0.04 0.01
7 0.25 1.25 0.06 0.04 0.05 0.26
sd's 0.30 2.63 0.12 0.02 0.07 0.08
8 0.27 1.79 0.04 0.06 0.13 0.64
sd's 0.18 3.06 0.03 0.01 0.15 0.41
9 0.22 1.40 0.04 0.13 0.03 0.35
sd's 0.21 2.34 0.04 0.08 0.04 0.04
10 0.18 1.58 0.05 0.23 0.03 0.39
sd's 0.17 2.01 0.04 0.11 0.03 0.03
11 0.22 3.23 0.06 0.45 0.04 0.43
sd's 0.16 5.84 0.06 0.27 0.05 0.02
12 0.19 4.01 0.05 0.84 0.04 0.56
sd's 0.15 10.46 0.04 0.38 0.06 0.36
13 0.21 5.97 0.09 3.05 0.04 0.56
sd's 0.14 8.84 0.16 3.08 0.06 0.16
14 0.19 3.57 0.09 4.05 0.16 0.94
sd's 0.14 5.96 0.21 3.26 0.24 0.42
E = scaled average error, and T = average time

Results for REJNRM are not given, but some tests were done. Typical times were approximately one hundred times longer than the times required for $\epsilon = 0.01$. The SPHNRM times show a similar pattern, as is expected from a Monte-Carlo method, where the error should decrease by a factor of ten when the number of sample points increases by a factor of one hundred. All three of the methods RANNRM, SADNRM and KRONRM continue to be robust and reliable, and except maybe for m = 3 or 4, are faster than the other methods. The RANNRM times increased by a factor of approximately one hundred compared to the times for $\epsilon = 0.01$, as is expected for a Monte-Carlo method. Overall, SADNRM is faster for m < 12 and then KRONRM is faster.

The final tests for the constant $\sigma$ covariance matrices used $\epsilon = 0.0001$ (Table 5).

Table 5: Constant $\sigma_{i,j}$, $\epsilon = 0.0001$
  SADNRM KRONRM
m E T E T
3 0.11 0.00 0.05 0.09
sd's 0.11 0.00 0.05 0.02
4 0.25 0.01 0.15 0.19
sd's 0.42 0.01 0.16 0.21
5 0.20 0.03 0.19 0.33
sd's 0.21 0.03 0.27 0.16
6 0.18 0.13 0.20 1.18
sd's 0.20 0.13 0.14 1.86
7 0.13 0.28 0.16 0.99
sd's 0.13 0.23 0.21 0.84
8 0.17 0.54 0.22 4.89
sd's 0.17 0.45 0.33 5.77
9 0.17 1.24 0.37 6.69
sd's 0.08 0.90 1.04 15.18
10 0.16 2.36 0.28 14.72
sd's 0.16 1.82 0.38 24.11
11 0.14 4.60 0.20 21.43
sd's 0.10 4.33 0.24 46.30
12 0.16 12.40 0.17 28.64
sd's 0.13 9.17 0.22 50.60

The last tests use $\epsilon = 0.01$ with ``completely'' random $\Sigma$'s for each $F({\bf b})$. For this test run, fifty random $\Sigma$'s were generated using a method described by Marsaglia and Olkin (1984). With this method, a lower triangular matrix $\hat{C}$ is first generated, with elements uniformly random from [-1,1]. The columns of $\hat{C}$ are then scaled so that they have unit 2-norms and positive diagonal entries. The result is a lower trangular matrix Cthat is used to produce a random covariance matrix $\Sigma = CC^t$. For each test run, fifty of these random covariance matrices were generated for each m; the random ${\bf b}$'s were generated in the same manner as they were generated for the other tests. Because exact $P({\bf b})$ values were not known for these test problems ``accurate'' $P({\bf b})$ values were computed using KRONRM with $\epsilon = .0005$. Selected test results for are given in Tables 6 and 7.



Table 6: Random $\sigma_{i,j}$ Results, $\epsilon = 0.01$    
  REJNRM SPHNRM MULNOR    
m E T E T E T    
3 0.24 0.82 0.27 0.14 0.05 0.01    
sd's 0.17 0.45 0.23 0.10 0.04 0.01    
4 0.20 0.96 0.27 0.08 0.04 4.96    
sd's 0.17 0.48 0.21 0.05 0.04 23.76    
5 0.22 1.18 0.20 0.15 - -    
sd's 0.17 0.61 0.17 0.08 - -    
6 0.27 1.27 0.15 0.17 - -    
sd's 0.24 0.65 0.14 0.01 - -    
7 0.23 1.19 0.15 0.43 - -    
sd's 0.20 0.73 0.12 0.18 - -    
8 0.74 1.34 0.63 0.68 - -    
sd's 3.54 0.84 3.55 0.35 - -    



Table 7: Random $\sigma_{i,j}$ Results, $\epsilon = 0.01$    
  SPHNRM RANNRM SADNRM    
m E T E T E T    
3 0.16 0.08 0.25 0.01 0.03 0.10    
sd's 0.16 0.12 0.61 0.01 0.06 0.04    
4 0.21 0.11 0.19 0.01 0.05 0.14    
sd's 0.19 0.19 0.36 0.01 0.07 0.07    
5 0.16 0.10 0.15 0.01 0.07 0.19    
sd's 0.13 0.13 0.20 0.01 0.07 0.09    
6 0.25 0.13 0.16 0.02 0.07 0.21    
sd's 0.20 0.13 0.20 0.02 0.05 0.04    
7 0.24 0.15 0.13 0.05 0.11 0.28    
sd's 0.19 0.22 0.13 0.04 0.11 0.15    
8 0.24 0.15 0.13 0.07 0.14 0.45    
sd's 0.18 0.16 0.14 0.05 0.16 0.32    
9 0.27 0.13 0.12 0.29 0.08 0.36    
sd's 0.20 0.13 0.11 0.89 0.08 0.08    
10 0.21 0.19 0.16 1.07 0.08 0.42    
sd's 0.17 0.18 0.15 5.20 0.09 0.21    
11 0.20 0.23 0.21 2.93 0.11 1.12    
sd's 0.15 0.22 0.28 10.95 0.15 4.42    
12 0.19 0.23 0.19 4.33 0.12 0.60    
sd's 0.15 0.23 0.23 11.48 0.19 0.54    
13 0.25 0.20 0.16 11.98 0.06 0.55    
sd's 0.21 0.23 0.32 28.41 0.08 0.11    
14 0.20 0.24 0.36 16.62 0.12 0.68    
sd's 0.16 0.42 1.19 44.78 0.11 0.27    
E = scaled average error and T = average time    

The ``random $\sigma_{i,j}$'' problems are apparently harder for the different methods. SADNRM appears to be best for m < 10, and then KRONRM is faster.



 

Alan C Genz
1999-10-21