
Rayleigh-Ritz, Background:

Let us consider more generally the case of $A$ an $n \times n$ Hermitian matrix and ${\bf x}$ an $n$-dimensional vector. ``Hermitian'', written $A^\dagger = A$ for short, means that the complex conjugate of the transpose of $A$ is the same as $A$; if $A$ is a real matrix, the condition reduces to $A^{T} = A$. Such matrices often arise from self-adjoint continuous operators which model some physical process. The complex version appears often in the context of quantum mechanics and acoustics.

We will indicate by an overbar the operation of taking the conjugate transpose. If $A$ and ${\bf x}$ were real, this operation would simply involve the transpose.

Since $A^\dagger = A$ the eigenvalues are real and can be organized as

\begin{displaymath}
\lambda_{min}\equiv\lambda_1 \le \lambda_2 \le \lambda_3 \le \cdots \le
\lambda_n \equiv \lambda_{max}.
\end{displaymath}

We will see that the smallest and largest eigenvalues may be thought of as solutions of constrained minimization and maximization problems.

Theorem (Rayleigh-Ritz): Let $A$ be as above, with its eigenvalues ordered as above. Then

\begin{displaymath}
\lambda_1 \overline{\bf x} {\bf x} \le
\overline{\bf x} A {\bf x} \le
\lambda_n \overline{\bf x} {\bf x},
\qquad \mbox{for all } {\bf x} \in C^{n}.
\end{displaymath}

Furthermore,

\begin{displaymath}
\lambda_{max} = \max_{{\bf x} \ne 0} \frac{ \overline{\bf x} A {\bf x}}
{\overline{\bf x} {\bf x}} = \max_{\vert{\bf x}\vert^2=1} \overline{\bf x} A {\bf x},
\end{displaymath}

and

\begin{displaymath}
\lambda_{min} = \min_{{\bf x} \ne 0} \frac{ \overline{\bf x} A {\bf x}}
{\overline{\bf x} {\bf x}} = \min_{\vert{\bf x}\vert^2=1} \overline{\bf x} A {\bf x}.
\end{displaymath}
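
Before the proof, a small numerical illustration may help (this is not part of the original argument): the Python sketch below, using NumPy, builds a random Hermitian matrix and checks that the Rayleigh quotient of arbitrary nonzero vectors always lies between $\lambda_{min}$ and $\lambda_{max}$. The matrix size and the number of trial vectors are arbitrary choices.

\begin{verbatim}
# Sketch: the Rayleigh quotient of any nonzero vector lies in
# [lambda_min, lambda_max] for a Hermitian matrix A.
import numpy as np

n = 6
B = np.random.randn(n, n) + 1j * np.random.randn(n, n)
A = (B + B.conj().T) / 2                 # Hermitian by construction

lam = np.linalg.eigvalsh(A)              # real eigenvalues, ascending order
lam_min, lam_max = lam[0], lam[-1]

for _ in range(1000):
    x = np.random.randn(n) + 1j * np.random.randn(n)
    rq = (x.conj() @ A @ x).real / (x.conj() @ x).real   # Rayleigh quotient
    assert lam_min - 1e-12 <= rq <= lam_max + 1e-12

print(lam_min, lam_max)
\end{verbatim}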

Proof: Since $A=A^\dagger$, there exists a unitary matrix $U$ such that $A=U \Lambda U^{\dagger}$, with $\Lambda = \mbox{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$. For any ${\bf x} \in C^n$ we have

\begin{displaymath}
\overline{\bf x} A {\bf x} = \overline{\bf x} U \Lambda U^{\dagger} {\bf x} =
\overline{(U^\dagger {\bf x})} \Lambda (U^\dagger {\bf x}) =
\sum_{i=1}^n \lambda_i \vert(U^\dagger {\bf x})_i\vert^2.
\end{displaymath}

Since each $\vert(U^\dagger {\bf x})_i\vert^2$ is non-negative,

\begin{displaymath}
\lambda_{min} \sum_{i=1}^n \vert(U^\dagger {\bf x})_i\vert^2 \le
\sum_{i=1}^n \lambda_i \vert(U^\dagger {\bf x})_i\vert^2 \le
\lambda_{max} \sum_{i=1}^n \vert(U^\dagger {\bf x})_i\vert^2.
\end{displaymath}

Because $U$ is unitary,

\begin{displaymath}
\sum_{i=1}^n \vert(U^\dagger {\bf x})_i\vert^2 =
\sum_{i=1}^n \vert{\bf x}_i\vert^2 = \overline{\bf x} {\bf x}.
\end{displaymath}

Hence,

\begin{displaymath}
\lambda_1 \overline{{\bf x}} {\bf x} =
\lambda_{min} \overline{{\bf x}} {\bf x} \le
\overline{{\bf x}} A {\bf x} \le
\lambda_{max} \overline{{\bf x}} {\bf x} = \lambda_n \overline{{\bf x}} {\bf x}.
\end{displaymath}

These are sharp inequalities. If ${\bf x}$ is an eigenvector of $A$ associated with $\lambda_1$, then

\begin{displaymath}
\overline{\bf x} A {\bf x} = \lambda_1 \overline{\bf x} {\bf x}.
\end{displaymath}

The same argument holds for $\lambda_n$.

Furthermore, if ${\bf x} \ne 0$ then

\begin{displaymath}
\frac{ \overline{\bf x} A {\bf x}}{\overline{\bf x} {\bf x}} \le
\lambda_n,
\end{displaymath}

so

\begin{displaymath}
\max_{{\bf x} \ne 0} \frac{ \overline{\bf x} A {\bf x}}{\overline{\bf x} {\bf x}}
= \lambda_n.
\end{displaymath}

Finally, since ${\bf x} \ne 0$, we can write

\begin{displaymath}
\frac{ \overline{\bf x} A {\bf x}}{\overline{\bf x} {\bf x}} =
\overline{ \frac{{\bf x}}{\sqrt{\overline{\bf x}{\bf x}}}} \, A \,
\frac{{\bf x}}{\sqrt{\overline{\bf x}{\bf x}}}
\end{displaymath}

and

\begin{displaymath}
\overline{ \frac{{\bf x}}{\sqrt{\overline{\bf x}{\bf x}}}}
\frac{{\bf x}}{\sqrt{\overline{\bf x}{\bf x}}} = 1.
\end{displaymath}

Hence, the maximization over ${\bf x} \ne 0$ above is equivalent to

\begin{displaymath}
\max_{\vert{\bf x}\vert^2=1 }
\overline{\bf x} A {\bf x}
= \lambda_n.
\end{displaymath}

The same argument, with minima in place of maxima, holds for $\lambda_1$. $\Box$

Algorithm

Now we will revert to the case of $A$ an $n\times n$ symmetric real matrix for the presentation of the algorithm.

Choose some initial guess $x^{(0)}$, an $n$-dimensional real vector, and compute

\begin{displaymath}
x^{(m+1)}=A x^{(m)}
\end{displaymath}

then

\begin{displaymath}
\lambda_1^{(m+1)}=\frac{(x^{(m+1)}, x^{(m)})}{(x^{(m)},
x^{(m)})}, \quad m\ge 0,
\end{displaymath}

where $(w,z)=\sum^n_{i=1}w_iz_i$, for $w, z\in {\mathbb{R}}^n$, is the usual inner product.
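
As an illustration, here is a minimal Python sketch of this iteration, using NumPy; the function name and the test matrix are choices made here, not part of the notes, and the iterate is rescaled each step only to avoid overflow (the Rayleigh-quotient estimate is unchanged by rescaling).

\begin{verbatim}
# Sketch of the iteration x^{(m+1)} = A x^{(m)} with Rayleigh-quotient
# estimates, assuming A is a real symmetric matrix with a dominant eigenvalue.
import numpy as np

def rayleigh_power(A, x0, iters=50):
    """Return the successive estimates lambda^{(m+1)} = (A x, x) / (x, x)."""
    x = np.asarray(x0, dtype=float)
    estimates = []
    for _ in range(iters):
        Ax = A @ x                            # x^{(m+1)} = A x^{(m)}
        estimates.append((Ax @ x) / (x @ x))  # Rayleigh-quotient estimate
        x = Ax / np.linalg.norm(Ax)           # rescale; estimate is scale-invariant
    return estimates

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
est = rayleigh_power(A, x0=np.ones(3), iters=20)
print(est[-1], np.linalg.eigvalsh(A)[-1])     # estimate vs. largest eigenvalue
\end{verbatim}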

In fact, writing $x^{(m)}=\sum^n_{j=1}\alpha^{(m)}_j u_j$, where the $u_j$ are orthonormal eigenvectors of $A$, we have $\alpha^{(m)}_j = \alpha^{(0)}_j \lambda_j^m$, and therefore

\begin{displaymath}
\lambda_1^{(m+1)} = \frac{\sum^n_{j=1}\vert\alpha^{(0)}_j\vert^{2}\lambda_j^{2m+1}}{
\sum^n_{j=1}\vert\alpha^{(0)}_j\vert^{2}\lambda_j^{2m}}
\end{displaymath}

hence, assuming the eigenvalues are now labeled in order of decreasing magnitude, $\vert\lambda_1\vert > \vert\lambda_2\vert \ge \cdots \ge \vert\lambda_n\vert$, and that $\alpha^{(0)}_1 \ne 0$, it is easy to see that

\begin{displaymath}
\lambda_1^{(m+1)}=
\lambda_1\left[1+\mathcal{O}\left(\left[\frac{\lambda_2}{\lambda_1}\right]^
{2m}\right)\right],
\end{displaymath}

so the error decreases like $[\lambda_2/\lambda_1]^{2m}$ rather than $[\lambda_2/\lambda_1]^{m}$, an improvement over the previous method. $\Box$
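
As a rough check of this rate, the sketch below (reusing the function defined in the previous sketch) compares the error of each estimate with $(\lambda_2/\lambda_1)^{2m}$; the quotient of the two should settle to a roughly constant value as $m$ grows.

\begin{verbatim}
# Check the claimed decay rate: the error |lambda^{(m+1)} - lambda_1| should
# shrink roughly like (lambda_2/lambda_1)^{2m}.  Uses rayleigh_power from above.
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
lam = np.linalg.eigvalsh(A)                   # ascending order
lam1, lam2 = lam[-1], lam[-2]                 # dominant and next eigenvalue

est = rayleigh_power(A, x0=np.ones(3), iters=15)
for m, e in enumerate(est):
    err = abs(e - lam1)
    print(m, err, err / (lam2 / lam1) ** (2 * m))
\end{verbatim}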


Juan Restrepo 2003-04-12