next up previous contents
Next: Backward Differentiation Formulas (BDF's) Up: The INITIAL VALUE PROBLEM Previous: Convergence Analysis for ERK   Contents

Multi-step Methods

Multi-step methods come from the Quadrature Interpretation of original problem (see 0.1.3). These schemes use a number of previously computed approximate solutions at previous $x$-steps to find the solution at the current step. Why use a multi-step method? Can get higher order truncation errors and are generally efficient since the computations are usually elementary. Again, there are implicit and explicit multistep methods.

Take initial value problem and advance one step (again, consideration is limited to equally-spaced nodes in $x$). Multi-step methods can be written as

(42) $\displaystyle y_{n+1}=y_n+\int^{x_{n+1}}_{x_n}f(t,y(t))dt=y_{n-r}+ \int^{x_{n+1}}_{x_{n-r}}f(t,y(t))dt.$

ADAMS-BASHFORTH FORMULA (AB)

      $\displaystyle y_{n+1}=y_n+af_n+bf_{n-1}+cf_{n-2}\cdots$
      where $\displaystyle f_n\equiv f(x_n,y_n),y_n=y(x_n)$
      $\displaystyle \qquad a,b,c,\ldots$ are constants

Example: Adams-Bashforth, order 5:

$\displaystyle AB5: y_{n+1}=y_n+\frac{h}{720}[1901f_n+2616f_{n-2}+251f_{n-4}
-2774f_{n-1}-1274f_{n-3}]
$

Coefficients come from the following:

(43) $\displaystyle \int^{x_{n+1}}_{x_n}f(t,y(t))dt\approx h[Af_n+Bf_{n-1}+Cf_{n-2} +Df_{n-3}+Ef_{n-4}]$

and require that (43) be exact when $f$ is a polynomial of degree at most $4$. Let us consider how the AB5 is constructed: Denote ${\cal P}_k$ be the family of polynomials of degree at most $k$. Recast problem of finding coefficients into a linear algebra problem . . . Without loss of generality, work this out at $x_n=0$ and assume that $h=1$.

$\displaystyle {\cal P}_4=\{ 1,x,x(x+1),x(x+1)(x+2),x(x+1)(x+2)(x+3)\}
\equiv\{ p_0,p_1,p_2,p_3,p_4\}$ is a basis

then enforce

$\displaystyle \int^1_0p_n(t)dt=Ap_n(0)+Bp_n(-1)+Cp_n(-2)+Dp_n(-3)+Ep_n(-4)
$

obtain:

$\displaystyle \left\{\begin{array}{r}
A+B+C+D+E=1 \\
-B-2C-3D-4E=\frac12 \\
2C+6D+12E=\frac56\\
-6D-24E=\frac94 \\
24E=\frac{251}{30}\end{array}\right.
$

This is a ``Method of Constant Coefficients'', a general technique that can be used to obtain any order formula (see 475A notes on quadrature techniques).

Remark. One can generate ${\cal P}_n$ basis conveniently using Newton difference formulas (see 475A notes on Newton difference formulas). In fact, it is a good idea to review notes on quadrature and on interpolation, in particular, Gaussian and Chebychev Quadrature and interpolation, to draw conclusions on whether it makes sense to use a nonuniform grid from a practical point of view.

Exercise) Why are these not used in initial value problems all that often, if at all?

ADAMS-MOULTON FORMULAS (AM)

Assume (42) can be written as

      $\displaystyle y_{n+1}=y_n+af_{n+1}+bf_n+cf_{n-1}\cdots$
      $\displaystyle \hspace{1in}\nearrow$
         new

Example: (AM5)

(44) $\displaystyle y_{n+1}=y_n+\frac{h}{720}[251f_{n+1}+646f_n+106f_{n-2}-264f_{n-1} -19f_{n-3}]$

derived by Method of Undetermined Coefficients (exercise).

Note the appearance of $f_{n+1}$, making this method implicit.

How to advance (44) in $n$?

Answer: use $AB5$ as ``predictor'' then an $AM5$ ``corrector'':

$\displaystyle \begin{array}{lll}
ABN & \tilde y_{n+1}=y_n+af_n+bf_{n-1}\cdots &...
...\\
AMN & y_{n+1}=y_n+p\tilde f_{n+1}+qf_n\ldots & \mbox{Corrector}
\end{array}$

where $N$ is the order (want to use same order for predictor and corrector, usually)

   and $\displaystyle \left\{\begin{array}{l}
f_n=f(x_n,y_n)  \tilde f_n=f(x_n,\tilde y_n)
\end{array}\right.
$

How to start integration?

Commonly $\to$ use ERKN (explicit Runge-Kutta of order $N$) to find enough $y_n$'s for the multi-step to take over.

Another way $\to$ use incremental-order incremental-sized-step multi-step method.

Another way $\to$ use above in conjunction with fixed point iteration to find the implicit part of the AM stage.

Exercise) write down in detail the algorithmic strategies to accomplish these last two choices above.

Multi-Step Scheme Convergence and Stability

Assume $x_n=x_0+nh$

      $\displaystyle n=0,1,2\ldots N(h)$
      $\displaystyle b=x_0+N(h)h$
      $\displaystyle h=\displaystyle \frac{b-x_0}{N(h)}.$

As usual, let $y_n=y(x_n)$.

Express multi-step method as

(45)     $\displaystyle y_{n+1}=\displaystyle \sum^p_{j=0}a_jy_{n-j}+h\sum^p_{j=-1}b_jf(x_{n-j},
y_{n-j})$
      $\displaystyle x_{p+1}\le x_{n+1}\le b$

For the initial value problem

$\displaystyle \left\{\begin{array}{l}
Y'=f(x,Y)  Y(0)=Y_0\end{array}\right.
$

with $Y=Y(x),f$ continuous. Also assume there exists a Lipschitz constant.

Some definitions: we say

Theorem. Convergence implies consistency.

Proof. (will be proven in context of numerical solution of partial differential equations, later discussed in this class) (see 0.4). $\Box$

Stability of Multi-step Methods
that is, of

$\displaystyle y_{n+1}-\sum^p_{j=0}a_jy_{n-j}-h\sum^p_{j=-1}b_jf(x_{n-j},y_{n-j})=0.
$

which is a difference equation. We want to consider in detail the issue of stability of multi-step methods. Recall from our consideration of Difference Equation solutions to the Null-space problem (see See Difference Equations.) that it is sensible to assume that there exists a polynomial associated with (45) of the form

$\displaystyle \rho(r)=r^{p+1}-\sum^p_{j=0}a_jr^{p-j}
$

Note: $\rho(1)=0$ from consistency condition:

$\displaystyle \begin{array}{l}
\displaystyle \sum^p_{j=0}a_j=1 \\
\sum^p_{j=0}...
...d}\\
\mbox{shows that}\\
\leftarrow\\
\mbox{implies consistency)}\end{array}$

Let $r_0=1,r_1,r_2\ldots r_p$ be roots. Then

(45) satisfies ``Root Condition'' if

      $\displaystyle \vert r_j\vert\le 1\qquad j=0,1,\ldots p$
  for   $\displaystyle \vert r_j\vert=1\Rightarrow$ these must be simple roots.

THE BIG PICTURE:

$\displaystyle \begin{array}{ccccc}
& & \mbox{Strong} & \Rightarrow & \mbox{Rela...
...box{Root} & \Leftarrow &
\mbox{Stability} \\
& & \mbox{Conditions}
\end{array}$

Theorem. (Stability) Suppose (45) is consistent. Then (45) is stable if and only if the root condition is satisfied.

Example:

      $\displaystyle y_{n+1}=3y_n-2y_{n-1}+\displaystyle \frac{h}{2}[f(x_n,y_n)-3f(x_{n-1},
y_{n-1})]\quad n\ge 1$
      $\displaystyle T_n(Y)=\displaystyle \frac{7}{12}h^3Y'''(\xi_n)\qquad x_{n-1}\le\xi_n
\le x_{n+1}$
      consider $\displaystyle \left\{\begin{array}{c}
y'=0  y(0)=0\end{array}\right.\qquad\Rightarrow Y(x)=0$

So if $y_0=y_1=0$ and $y_n=0  n\ge 0$. Perturbation $z_0=\varepsilon /2,
z_1=\varepsilon \Rightarrow$

$\displaystyle z_n=\varepsilon 2^{n-1}\quad n\ge 0
$

So
      $\displaystyle \displaystyle \max_{x_0\le x_n\le b}\vert y_n-z_n\vert=\max_{0\le x_n\le b}\vert\varepsilon \vert
2^{n-1}=\vert\varepsilon \vert 2^{N(h)-1}$
      and $\displaystyle N(h)\to\infty$ as % latex2html id marker 22747
$\displaystyle h\to 0\therefore$   unstable

Compute $\rho(r)=r^2-3r+2$ with roots % latex2html id marker 22751
$ r_0=1,r_1=2\therefore$ Root condition violated. Since we're at it, we should probably also do the more general problem of looking at a system

$\displaystyle \left\{\begin{array}{ll}
{\bf y}'={\bf f}(x,{\bf y}) & {\bf y}\in {\cal R}^m \\
{\bf y}(0)={\bf y}_0 & {\bf f}\in{\cal R}^m\end{array}\right.
$

if $f$ is differentiable $\Rightarrow J\equiv\frac{\partial
f_i}{\partial y_j}\quad1\le i, j\le m$


  $\displaystyle \Rightarrow$   $\displaystyle {\bf y}'=\wedge{\bf y}+{\bf g}(x)\qquad m\times m$    system
      with $\displaystyle \wedge=f_y({\bf x}_0,{\bf Y}_0)$   with $\displaystyle \lambda_1,\lambda_2,\ldots\lambda_m$ eigenvalues.

We can can make some headway in understanding what happens in the system case by considering what happens in the simpler problem

$\displaystyle \left\{\begin{array}{l} y'=\lambda y y(0)=1\end{array}\right.
$

Using (45)
      $\displaystyle y_{n+1}=\displaystyle \sum^p_{j=0}a_jy_{n-j}+h\lambda\sum^p_{j=-1}
b_jy_{n-j}$
      $\displaystyle (1-h\lambda b_{-1})y_{n+1}-\displaystyle \sum^p_{j=0}(a_j+h\lambda b_j)
y_{n-j}=0\quad n\ge p$

``homogeneous linear differential equation of order $p+1$''

See Difference Equations.

To solve, let

$\displaystyle y_n=r^n\quad n\ge 0
$

and hope to find $p+1$ linearly independent solutions so that any solution can be expressed as a linear combination.

Substitute $r^n$ and cancel $r^{n-p}$

(46) $\displaystyle (1-h\lambda b_{-1})r^{p+1}-\displaystyle \sum^p_{j=0} (a_j+h\lambda b_j)r^{p-j}=0.$

   Let $\displaystyle \sigma(r)\equiv b_{-1}r^{p+1}+\displaystyle \sum^p_{j=0}b_j
r^{p-j},
$

(45) becomes

(47) $\displaystyle \rho(r)-h\lambda\sigma(r)=0$

known as the ``characteristic equation.''

Denote roots as $r_0(h\lambda),\ldots r_p(h\lambda)$ depending continuously on $h\lambda$. When $h\lambda=0$ (47) becomes

      $\displaystyle \rho(r)=0$   so$\displaystyle \qquad r_j(0)=r_j\quad j=0,1\ldots p.$
      $\displaystyle \hspace{.75in}\nearrow$
         roots of $\displaystyle \rho(r).$

If $r_j(h\lambda)$ are distinct then

$\displaystyle y_n=\sum^p_{j=0}\gamma_j[r_j(h\lambda)]^n\quad n\ge 0
$

if $r_j(h\lambda)$ has a root of multiplicity $\nu>1$ construct extra $\nu$ linearly independent by

$\displaystyle [r_j(h\lambda)]^n,n[r_j(h\lambda)]^n,\ldots n^{\nu-1}
[r_j(h\lambda)]^n\ldots
$

$\Box$

Proof. (Stability Theorem)

Here we present a simplified proof (see Isaacson and Keller '66 for full proof).

1) Assume root condition satisfied.

2) Roots are distinct $r_j(0)$ and $r_j(h\lambda)\quad 0<h\le h_0$.

Take $z_n$ and $y_n$ as solutions to

$\displaystyle (1-h\lambda b_{-1})y_{n+1}-\sum^p_{j=0}(a_j+h\lambda b_j)
y_{n-j}=0$   on $\displaystyle [x_0,b]
$

let $e_n=y_n-z_n$ and assume

(48) $\displaystyle \max_{0\le n\le p}\vert y_n-z_n\vert\le\varepsilon \quad 0\le h\le h_0$


(49)     % latex2html id marker 22834
$\displaystyle \therefore e_{n+1}(1-h\lambda b_{-1})-\sum^p_{j=0}
(a_j+h\lambda b_j)e_{n-j}=0$   for $\displaystyle x_{p+1}\le x_{n+1}\le b$
      with solution $\displaystyle e_n=\sum^p_{j=0}\gamma_j[r_j(h\lambda)]^n
\quad n\ge 0$

The coefficients must satisfy

$\displaystyle \left[\begin{array}{c}
\gamma_0+\gamma_1\cdots \gamma_p=e_0 \\
\...
...r_0(h\lambda)]^p+\cdots\gamma_p [\gamma_P(h\lambda)]^p
= e_p\end{array}\right]
$

then $e_0\ldots e_p$ will satisfy (49)

Using linear theory and (48) $\Rightarrow\displaystyle \max_{0\le i\le p}
\vert\gamma_i\vert\le c_1\varepsilon  0<h\le h_0$

To bound $e_n$ on $[x_0, b]$ we must bound each $[r_j(h\lambda)]^n$. To do so, consider

(50) $\displaystyle r_j(u)=\gamma_j(0)+ur'_j(\zeta)$

for some $\zeta$ between 0 and $u$ (variation of parameters). Compute $r'_j$: differentiate characteristic equation

$\displaystyle \rho(r_j(u))-u\sigma(r_j(u))=0
$

Therefore

(51) $\displaystyle r'_j(u)=\displaystyle \frac{\sigma(r_j(u))} {\rho'(r_j(u))-u\sigma'(r_j(u))}$

by assumption $r_j(0)$ simple root of % latex2html id marker 22866
$ \rho(r)=0\quad 0\le j\le
p\therefore \rho'(r_j(0))\ne 0$ and by continuity, $\rho'(r_j(u))\ne 0$ for all $u$ small % latex2html id marker 22872
$ \therefore$ denominator not zero and

$\displaystyle \vert r'_j(u)\vert\le c_2\qquad\forall \vert u\vert\le u_0$   for some $\displaystyle u_0>0.
$

Using (50) and the root condition: $\vert r_j\vert\le 1$, we get
      $\displaystyle \vert r_j(h\lambda)\vert\le \vert r_j(0)\vert+c_2\vert h\lambda\vert\le 1+c_2
\vert h\lambda\vert$
       
      $\displaystyle \vert[r_j(h\lambda)]^n\vert\le [1+c_2\vert h\lambda\vert]^n\le
e^{c_2n\vert h\lambda\vert}\le e^{c_2(b-x_0)\vert\lambda\vert}  \forall
h\le h_0.$
       
      % latex2html id marker 22884
$\displaystyle \therefore\max_{x_0\le x\le b}\vert ...
...t\le c_3\vert\varepsilon \vert e^{c_2
(b-x_0)\vert\lambda\vert}\quad 0<h\le h_0$

$\Box$

Theorem. (Convergence, Dahlquist Equivalence Theorem) Assume scheme is consistent. Then (45) is convergent if and only if root condition is satisfied.

Proof. Assume root condition is satisfied. Again, general proof in Isaacson and Keller. Assume $r_j(0)$ distinct.

Again

$\displaystyle \left\{\begin{array}{l}
y'=\lambda y  y(x_0)=1\end{array}\right.
$

and

$\displaystyle \gamma_0[r_0(h\lambda)]^n
$

of

$\displaystyle y_n=\displaystyle \sum^p_{j=0}\gamma_j[r_j(h\lambda)]^n
$

converges to solution $Y(x)=e^{\lambda x}$ on $[x_0, b]$. The remaining terms $\gamma_j[r_j(h\lambda)]^nj=1,\ldots p$ are parasitic and shown to $\to 0$ as $h\to 0$.

Expand $r_0(h\lambda)=r_0(0)+h\lambda r'(0)+{\cal O}(h^2)$.

   From (51)$\displaystyle \qquad r_0(0)=\frac{\sigma(1)}
{\rho'(1)}
$

and using consistency condition $\displaystyle \sum^p_{j=0}a_j=1$ and $\sum^p_{j=0}ja_j+\sum^p_{j=-1}b_j=1$ leads to $r_0'(0)=1$. Then $r_0(h\lambda)=1+h\lambda+{\cal O}(h^2)=e^{\lambda
h}+{\cal O}(h^2)$

$\displaystyle [r_0(h\lambda)]^n=e^{\lambda nh}[1+{\cal O}(h^2)]^n=e^{\lambda x_n}
[1+{\cal O}(h)]
$

over $x_0\le x_n\le b$ finite.

Thus

$\displaystyle \max_{0\le x_n\le h}[\vert r_0(h\lambda)\vert^n-e^{\lambda x_n}]\to 0$    as $\displaystyle h\to 0
$

We must now show that the coefficient $\gamma_0\to 1$ as $h\to 0$. Again $\gamma_0(h)\cdots\gamma_p(h)$ satisfy
(52) $\displaystyle \left[\begin{array}{c}
\gamma_0+\cdots\gamma_p=y_0 \\
\gamma_0[r...
...mma_0[r_0(h\lambda)]^P+\cdots \gamma_p [r_p(h\lambda)]^p=y_p
\end{array}\right.$    

the initial values $y_0\cdots y_p$ depend on $h$ and satisfy
      $\displaystyle \eta(h)\equiv\max_{0\le n\le p}\vert e^{\lambda x_n}-
y_n\vert\to$    as $\displaystyle h\to 0
$
  $\displaystyle \Rightarrow$   $\displaystyle \lim_{h\to 0}y_n=1\quad 0\le n\le p$

The coefficient $\gamma_0\to 1$ as $h\to 0$ (look at solution of linear system (52) and see that by Cramer's the denominator converges to Vandermonde determinant for $r_0(0)=1,r_1(0),\ldots
r_3(0)$ nonzero and distinct roots. Same for numerator.

% latex2html id marker 22952
$\displaystyle \therefore \{ y_n\}\to e^{\lambda x}$ as $\displaystyle h\to 0
$ on $\displaystyle [x_0,b].
$

$\Box$

Corollary. If (42) consistent. Then convergent if and only if stable.

Proof. Follows directly from above theorems.

$\Box$

Relative and Weak Stability:

Consider

$\displaystyle \left\{\begin{array}{l} y'=\lambda y y(0)=1\end{array}\right.
$

and the general solution $y_n=\displaystyle \sum^p_{j=0}\gamma_j[r_j(h\lambda)]^n,\quad n\ge 0$.

The convergence theorem states that parasitic solutions of $\gamma_j[r_j(h\lambda)]^n\to 0$ as $h\to 0$. However, we use finite $h$ and want, for any $x_n$, for them to be small compared to $\gamma_0[r_0(h\lambda)]^n$.

(53)     need% latex2html id marker 22975
$\displaystyle \therefore  \Rightarrow \vert r_j(h\lambda)\vert\le\vert r_0(h
\lambda)\vert\quad j=1,2,\ldots p$
      for $\displaystyle h$ sufficiently small.

This leads to concept of ``relative stability.''

A method is ``relatively stable'' if the characteristic roots $r_j(h\lambda)$ satisfy (53) for all sufficiently small $\vert\lambda h\vert$. And a method satisfies the ``strong root condition'' if

$\displaystyle \vert r_j(0)\vert<1$   for $\displaystyle j=1,2,\ldots p
$

Easy to check and it implies ``relative stability.''

Remark. Relative Stability does not imply the strong root condition (although they're equivalent for most methods). If multi-step method is stable but not relatively stable, it is ``weakly stable.''

Example: Using the bf midpoint method defined as $y_{n+1}=y_{n-1}+2hf(x_n,y_n),n\ge 1$ to solve

   and $\displaystyle \left\{\begin{array}{l}
Y'=\lambda Y  Y(0)=1\end{array}\right.
$

with exact solution

$\displaystyle Y(x)=e^{\lambda x}.
$

Take $y_n=r^n$    $n\ge 0$
  $\displaystyle r^{n+1}=r^{n-1}+2h\lambda r^n$   $\displaystyle \Rightarrow r^2=1+2h\lambda r$
  $\displaystyle r_0=h\lambda+\sqrt{1+h^2\lambda^2}$   $\displaystyle r_1=h\lambda-\sqrt{1+
h^2\lambda^2}$

so general solution

(54) $\displaystyle y_n=\beta_0r^n_0+\beta_1r_1^n,\quad n\ge 0$

$\displaystyle \left\{\begin{array}{l}
\beta_0+\beta_1=y_0  \beta_0r_0+\beta_1r_1=y_1\end{array}\right.
$

% latex2html id marker 23009
$\displaystyle \therefore\beta_0=\frac{y_1-r_1y_0}{r_0-r_1}\qquad\beta_1=
\frac{y_0r_0-y_1}{r_0-r_1},$ generally

using initial condition as above $\Rightarrow y_0=1,y_1=e^{\lambda h}$
      $\displaystyle \beta_0=\displaystyle \frac{e^{\lambda h}-r_1}{2\sqrt{1+h^2\lambda^2}}
=1+{\cal O}(h^2\lambda^2)$
      $\displaystyle \beta_1=\frac{r_0-e^{\lambda h}}{2\sqrt{1+\lambda^2h^2}}=
{\cal O}(h^3\lambda^3)$

$\beta_0\to 1$ $\beta_1\to 0$ as % latex2html id marker 23022
$ h\to 0\therefore\beta_0r^n_0$ in (54) should correspond to true solution $e^{\lambda x_n}$, since $\beta_1r_1^n\to 0$ as $h\to 0$. In fact

$\displaystyle r_0^n=e^{\lambda x_n}[1+{\cal O}(h)]
$

Now, assume $\lambda$ is real and positive (for illustration)

   then $\displaystyle r_0>\vert r_1\vert>0
$

thus $r_1^n$ increases less rapidly than $r^n_0$ so $\beta_0r^n_0$ will dominate. Now, assume $\lambda$ is real and negative

   then $\displaystyle 0<r_0<1\quad r_1<-1\quad h>0
$

% latex2html id marker 23046
$ \therefore\beta_1r^n_1$ will dominate $\beta_0r^n_0$ as $n\to\infty$, for fixed $h$, no matter how small $h$. The $\beta_0r_0^n\to 0$ as $n\to\infty$, whereas the term $\beta_1r^n_1$ increases, alternating sign as $n\to\infty$.

The $\beta_1r_1^n$ is the ``parasitic'' solution (a creation of the numerical method) $\Rightarrow$ Midpoint method is ``weakly stable'' $\ldots$ the parasitic solution will eventually make the solution diverge from the solution.

In summary, the midpoint method is weakly stable according to (53) since

$\displaystyle r_0(h\lambda)=1+h\lambda+{\cal O}(h^2)\quad r_1(h\lambda)=-1+h\lambda
+{\cal O}(h^2)
$

for $\lambda <0$.

Example: Try AB and AM. They have same characteristic polynomial when $h=0$:

$\displaystyle \rho(r)=r^{p+1}-r^p
$

The roots are % latex2html id marker 23078
$ r_0=1,r_j=0  j=1,\ldots p\therefore$ Strong Root condition is satisfied and Adams methods are relatively stable.

$\Box$


next up previous contents
Next: Backward Differentiation Formulas (BDF's) Up: The INITIAL VALUE PROBLEM Previous: Convergence Analysis for ERK   Contents
Juan Restrepo 2003-05-02