Choice of Optimization Method

A primary goal for the selection of an optimization method is to find a method that, given good starting points, requires only a few iterations for a large class of problems, so it would be desirable to use a second order method like Newton's method to find $t_\alpha$. The Newton iteration method for improving an estimate for $t_c$ for $t_\alpha$, successively replaces $t_c$ by $t_c - h(t_c)/h'(t_c)$. This method requires values for both $h(t)$ and $h'(t)$. If we make a simple change of variable ${\bf x}= t{\bf y}$ in the detailed expression for $h(t)$ determined from our definition of the MVT distribution function, we have (for the two-sided case)

\begin{eqnarray*}
h(t)&=&\alpha - 1 + \\
&&\hspace{-13mm}t^m\ \frac{2^{1-\frac{...
...{\bf y}^t \mbox{\boldmath$\Sigma$}^{-1} {\bf y}}{2}} d{\bf y}ds.
\end{eqnarray*}



Differentiating $h(t)$, and then changing the variables back to ${\bf x}$, we find

\begin{displaymath}h'(t) = \frac{1}{t}(mP(t) -H(t)),\end{displaymath}

where $H(t)$ is given by

\begin{displaymath}
\frac{2^{1-\frac{\nu}{2}}\vert\mbox{\boldmath $\Sigma$}\vert...
... x}^t \mbox{\boldmath $\Sigma$}^{-1} {\bf x}}{2}} d{\bf x}ds.
\end{displaymath}

For the one-sided case, the lower limits for the inner integral are all $-\infty$. In either case, $H(t)$ may be computed with only a little extra work during the computation of $P(t)$.

Given a starting interval $[t_a, t_b]$, with $t_\alpha \in [t_a,t_b]$. and a required error tolerance $\tau$ for our final estimate of $t_\alpha$, we let $t_c =\frac{t_a+t_b}{2}$ and use a Newton algorithm that repeats:
if $(t_b-t_a) > 2\tau$ and $\vert h(t_c)\vert > \tau \vert h'(t_c)\vert$, then

  1. if $h(t_c) <0$, set $(t_a, h(t_a)) = (t_c, h(t_c))$,
    otherwise set $(t_b, h(t_b)) = (t_c, h(t_c))$;
  2. set $t_c = t_c - h(t_c)/h'(t_c)$;
else stop and output $t_\alpha \approx \frac{t_a+t_b}{2}$ or $t_\alpha \approx t_c$.

We also investigated the use of Secant-like methods for solving $h(t)=0$. Some preliminary tests showed that the simple Secant method is not suitable for many problems because $\vert h'(t)\vert$ is sometimes very small near $t_\alpha$ (particularly when $\alpha$ is small), and this can result in divergence unless a very good starting value is available. Various bisection-Secant hybrid methods were considered and after some experiments, the ``Pegasus'' method (see Ralston and Rabinowitz, 1978) was selected. This method has asymptotic order of convergence similar to that of the Secant method and, at each iteration it provides a bracketing interval for $t_\alpha$. The Pegasus method that we finally implemented (starting with $\tau$ and $[t_a, t_b]$) initially sets $t_c =\frac{t_a+t_b}{2}$. If $h(t_c) <0$, we set $(t_a, h(t_a)) = (t_c, h(t_c))$; otherwise we set $(t_b, h(t_b)) = (t_c, h(t_c))$. Our basic iteration repeats:
if $(t_b-t_a) > 2\tau$ and $\vert h(t_c)\vert > \tau \vert\frac{h(t_b)-h(t_a)}{t_b-t_a}\vert$, then

  1. compute $t_c = t_b - h(t_b)(t_b - t_a)/(h(t_b)-h(t_a))$;
  2. if $h(t_c) <0$, set $(t_a, h(t_a)) = (t_c, h(t_c))$,
    otherwise set $h(t_a)=h(t_a)h(t_b)/(h(t_b)+h(t_c))$ and $(t_b, h(t_b)) = (t_c, h(t_c))$
else stop and output $t_\alpha \approx \frac{t_a+t_b}{2}$ or $t_\alpha \approx t_c$.
The Pegasus method is the same as the linearly convergent False-Position method except for the $h_a$ modification at step b), which improves the approximate order of convergence to $1.64$.




2003-02-17