Laurent Hoeltgen

Gram-Schmidt vs. Modified Gram-Schmidt

2020-08-30, update: 2021-08-17 ·  6 min read  ·  Mathematics

We compare the accuracy of the classical Gram-Schmidt algorithm to the modified Gram-Schmidt algorithm.

Problem Formulation

Given $k$ vectors $x_j \in \mathbb{R}^n$, we would like to find $k$ vectors $y_j \in \mathbb{R}^n$ such that the $y_j$ form an orthonormal system, i.e.

$$\lVert y_j \rVert = 1 \quad \forall j \quad\text{and}\quad \langle y_j, y_i \rangle = 0 \quad \forall j \neq i$$

and

$$\mathrm{span}\lbrace y_1, \ldots, y_j \rbrace = \mathrm{span}\lbrace x_1, \ldots, x_j \rbrace \quad \forall j = 1, \ldots, k$$

The most popular algorithm for solving this problem is probably Gram-Schmidt orthogonalisation. The algorithm is named after Erhard Schmidt and Jørgen Pedersen Gram. Schmidt published the algorithm in 1907 [5] but notes in his work that the method can already be found in Gram's work from 1883 [6]. The first reference to the "Gram-Schmidt" algorithm probably comes from [7]. Finally, a very readable historical overview is given in [4].

Classical Gram-Schmidt

Finding $y_1$ is rather straightforward. The first requirement tells us that $y_1$ must have length 1, and the span condition tells us that it must be parallel to $x_1$. The obvious choice is therefore to set

$$y_1 = \frac{x_1}{\lVert x_1 \rVert}$$

In order to get the second vector, let's have a look at our current situation in the sketch below.

Getting the second vector

We know $p$: it is given by $p = \langle x_2, y_1 \rangle y_1$. Thus the vector going from $p$ to $x_2$ is $x_2 - p$. From the sketch it is obvious that $x_2 - p$ is orthogonal to $y_1$ and thus, after normalisation, a perfect candidate for $y_2$. This gives us an idea for a general formula

$$y_j = x_j - \sum_{i=1}^{j-1} \langle x_j, y_i \rangle y_i$$

and we end up with the following algorithm

Classical Gram-Schmidt Algorithm:
1. set $y_1 = \frac{x_1}{\lVert x_1\rVert}$
2. for $j = 2, ..., k$ compute:
   $$\begin{align*}
   y_j &= x_j - \sum_{i=1}^{j-1} \langle x_j, y_i\rangle y_i \\
   y_j &= \frac{y_j}{\lVert y_j\rVert}
   \end{align*}$$
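Before moving on to the full implementation, the projection step for $y_2$ can be checked on a concrete pair of vectors (the numbers are chosen purely for illustration):

```julia
using LinearAlgebra

x1 = [3.0, 4.0]
x2 = [1.0, 2.0]

# first basis vector: normalise x1
y1 = x1 / norm(x1)

# projection of x2 onto y1 and the normalised orthogonal remainder
p = dot(x2, y1) * y1
y2 = (x2 - p) / norm(x2 - p)

dot(y1, y2)  # ≈ 0, i.e. y1 and y2 are orthogonal
```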

In Julia, classical Gram-Schmidt can be implemented as follows.

using LinearAlgebra

function classical_gram_schmidt_alt(matrix)
    # orthogonalises the columns of the input matrix
    num_vectors = size(matrix, 2)
    orth_matrix = zeros(size(matrix))
    for vec_idx = 1:num_vectors
        orth_matrix[:, vec_idx] = matrix[:, vec_idx]
        projection = zeros(size(matrix, 1))
        for span_base_idx = 1:(vec_idx-1)
            # accumulate the projections onto the already computed basis vectors
            projection += dot(orth_matrix[:, span_base_idx], orth_matrix[:, vec_idx])*orth_matrix[:, span_base_idx]
        end
        orth_matrix[:, vec_idx] -= projection
        # normalise the vector
        orth_matrix[:, vec_idx] /= norm(orth_matrix[:, vec_idx])
    end
    return orth_matrix
end

Modified Gram-Schmidt

Modified Gram-Schmidt performs the very same computational steps as classical Gram-Schmidt, just in a slightly different order. In classical Gram-Schmidt, each iteration computes a sum involving all previously computed vectors. In the modified version, errors introduced in earlier steps can be corrected in each subsequent step.

Modified Gram-Schmidt Algorithm:

1. set $y_j = x_j$ for all $j = 1, ..., k$
2. for $j = 1, ..., k$ compute:
   $$\begin{align*}
   y_j &= \frac{y_j}{\lVert y_j\rVert} \\
   y_i &= y_i - \langle y_j, y_i\rangle y_j\quad \forall i = j + 1, ..., k
   \end{align*}$$
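The two steps above can be walked through by hand on two small vectors (chosen for illustration); note how each remaining vector is updated immediately after a basis vector is finished:

```julia
using LinearAlgebra

# inputs, copied into the working variables y1 and y2
y1 = [1.0, 1.0, 0.0]
y2 = [1.0, 0.0, 1.0]

# step j = 1: normalise y1, then orthogonalise every remaining vector against it
y1 /= norm(y1)
y2 -= dot(y1, y2) * y1

# step j = 2: only the normalisation is left
y2 /= norm(y2)
```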

If you omit the normalisations, then classical Gram-Schmidt computes

$$\begin{align*}
y_2 &= x_2 - \langle x_2, y_1 \rangle y_1 \\
y_3 &= x_3 - \langle x_3, y_1 \rangle y_1 - \langle x_3, y_2 \rangle y_2 \\
y_4 &= x_4 - \langle x_4, y_1 \rangle y_1 - \langle x_4, y_2 \rangle y_2 - \langle x_4, y_3 \rangle y_3
\end{align*}$$

whereas modified Gram-Schmidt computes

$$\begin{align*}
y_2 &= x_2 - \langle x_2, y_1 \rangle y_1 \\
y_3 &= x_3 - \langle x_3, y_1 \rangle y_1 \\
&\;\;\vdots \\
y_k &= x_k - \langle x_k, y_1 \rangle y_1 \\[1ex]
y_3 &= y_3 - \langle y_3, y_2 \rangle y_2 = x_3 - \langle x_3, y_1 \rangle y_1 - \langle x_3, y_2 \rangle y_2 \\
&\;\;\vdots \\
y_k &= y_k - \langle y_k, y_2 \rangle y_2 = x_k - \langle x_k, y_1 \rangle y_1 - \langle x_k, y_2 \rangle y_2 \\
&\;\;\vdots
\end{align*}$$

If you consider the strategy from a blockwise point of view, you'll notice that in each step every remaining vector is orthogonalised against the result of the previous orthogonalisation. This does not happen in classical Gram-Schmidt, and it is what allows modified Gram-Schmidt to correct errors introduced in previous blocks.

In Julia, modified Gram-Schmidt can be implemented as follows.

using LinearAlgebra

function modified_gram_schmidt(matrix)
    # orthogonalises the columns of the input matrix
    num_vectors = size(matrix, 2)
    orth_matrix = float.(matrix) # work on a floating point copy of the input
    for vec_idx = 1:num_vectors
        # normalise the current vector
        orth_matrix[:, vec_idx] /= norm(orth_matrix[:, vec_idx])
        for span_base_idx = (vec_idx+1):num_vectors
            # orthogonalise all remaining vectors against the new basis vector
            orth_matrix[:, span_base_idx] -= dot(orth_matrix[:, span_base_idx], orth_matrix[:, vec_idx])*orth_matrix[:, vec_idx]
        end
    end
    return orth_matrix
end

The benefit of using the modified Gram-Schmidt algorithm is visible in the plot below. As input, I used regularised Hilbert matrices of different sizes and computed the operator norm of the matrix $I - Q^\top Q$, where $I$ is the identity matrix and $Q$ is our output matrix. In a perfect setting the error would be 0.

using LinearAlgebra

error_cgs = []
error_mgs = []
for n = 1:10
    H = [1.0/(i+j-1) for i=1:2^n, j=1:2^n] + 0.00001 * I # regularised Hilbert matrix
    eye = Matrix{Float64}(I, 2^n, 2^n)                   # identity matrix
    H_cgs = classical_gram_schmidt(H)
    H_mgs = modified_gram_schmidt(H)
    append!(error_cgs, opnorm(eye - transpose(H_cgs)*H_cgs, Inf))
    append!(error_mgs, opnorm(eye - transpose(H_mgs)*H_mgs, Inf))
end

Please note that the error is expressed in multiples of machine epsilon here.

Error evaluation

As we can see, the error is increasing much faster in classical Gram-Schmidt than in modified Gram-Schmidt.
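To illustrate what "multiples of machine epsilon" means here, the same kind of error can be measured for Julia's built-in Householder-based QR factorisation, which serves as a well-behaved baseline (this is a sketch, not the experiment above):

```julia
using LinearAlgebra

# an orthonormal matrix from a Householder QR factorisation of a random matrix
Q = Matrix(qr(randn(64, 64)).Q)

# raw orthogonality error, then the same error in multiples of machine epsilon
err = opnorm(I - transpose(Q) * Q, Inf)
err_in_eps = err / eps(Float64)
```

For Householder QR this quantity stays at a small number of epsilons even for ill-conditioned inputs, which is the yardstick the Gram-Schmidt errors above should be compared against.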

It is interesting to note that the following formulation of the Gram-Schmidt algorithm (which is really just a better implementation of classical Gram-Schmidt) is referred to as modified Gram-Schmidt by Schwarz and Rutishauser in [3].

using LinearAlgebra

function classical_gram_schmidt(matrix)
    # orthogonalises the columns of the input matrix
    num_vectors = size(matrix, 2)
    orth_matrix = float.(matrix) # work on a floating point copy of the input
    for vec_idx = 1:num_vectors
        for span_base_idx = 1:(vec_idx-1)
            # subtract the projection onto the previous basis vector
            orth_matrix[:, vec_idx] -= dot(orth_matrix[:, span_base_idx], orth_matrix[:, vec_idx])*orth_matrix[:, span_base_idx]
        end
        # normalise the vector
        orth_matrix[:, vec_idx] /= norm(orth_matrix[:, vec_idx])
    end
    return orth_matrix
end

Notes

All the code was evaluated with Julia version 1.5.0 (2020-08-01) using the official Julia Docker image.

References

[1] N. J. Higham, Accuracy and Stability of Numerical Algorithms, Society for Industrial and Applied Mathematics, 2002

[2] G. H. Golub and C. F. Van Loan, Matrix Computations, The John Hopkins University Press, 1996.

[3] W. Gander, Algorithms for the QR-Decomposition, Eidgenössische Technische Hochschule, Research Report No. 80-02, 1980. Retyped 2003

[4] S. J. Leon, A. Björck, and W. Gander, “Gram-Schmidt orthogonalization: 100 years and more,” Numerical Linear Algebra with Applications, vol. 20, no. 3, 2012.

[5] E. Schmidt, “Zur Theorie der linearen und nichtlinearen Integralgleichungen I. Teil: Entwicklung willkürlicher Funktionen nach Systemen vorgeschriebener,” Mathematische Annalen, vol. 63, pp. 433–476, 1907.

[6] J. P. Gram, “Ueber die Entwickelung reeller Functionen in Reihen mittels der Methode der kleinsten Quadrate,” Journal für die reine und angewandte Mathematik, vol. 94, pp. 41–73, 1883.

[7] Y. K. Wong, “An application of orthogonalization process to the theory of least squares,” Annals of Mathematical Statistics, vol. 6, pp. 53–75, 1935.


How to cite this page

Hoeltgen, Laurent: Gram-Schmidt vs. Modified Gram-Schmidt, 2021-08-17.

BibLaTeX code:
@online{gram-schmidt,
  author   = {Hoeltgen, Laurent},
  title    = {Gram-Schmidt vs. Modified Gram-Schmidt},
  date     = {2021-08-17},
  language = {english},
  url      = {https://www.laurenthoeltgen.name/content/blog/gram-schmidt}
}

CC BY-SA 4.0 Laurent Hoeltgen. Last modified: September 19, 2025.