QR Comparison with Other Implementations

We developed a QR decomposition algorithm based on the Gram-Schmidt orthogonalisation process in a series of posts here, here, here, and here. Let’s have a look at how well this algorithm performs against built-in implementations from Julia and other programming languages.

Test criteria

For a given matrix $A$ we get a decomposition $A=QR$ where $Q$ is orthogonal and $R$ is upper triangular. If $A$ does not have full column rank, then some columns of our matrix $Q$ will be zero (at least in our implementation). Thus, we will test the following:

  1. $Q^\top Q = I$ if $A$ has full rank
  2. $QR = A$
  3. $R = Q^\top A$
  4. $Q = AR^{-1}$

In our tests below we will use the backslash operator to compute $AR^{-1}$ (i.e. no explicit inversion of $R$).
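To make these criteria concrete, here is a minimal sketch of the four checks in Julia. The function name qr_errors and the use of opnorm to measure the errors are our own choices for illustration, not part of the evaluation code linked below.

```julia
using LinearAlgebra

# Sketch of the four test criteria for a computed decomposition A ≈ QR.
# Errors are reported in multiples of the machine epsilon, as in the plots.
function qr_errors(A, Q, R)
    e_orth  = opnorm(Q' * Q - I)       # 1. orthogonality (full-rank case)
    e_recon = opnorm(Q * R - A)        # 2. reconstruction QR = A
    e_R     = opnorm(R - Q' * A)       # 3. accuracy of R
    e_Q     = opnorm(Q - (R' \ A')')   # 4. Q = AR⁻¹ via backslash, no inversion
    return (e_orth, e_recon, e_R, e_Q) ./ eps()
end

# Example: errors of the built-in decomposition on a random matrix.
A = rand(100, 50)
F = qr(A)
qr_errors(A, Matrix(F.Q), F.R)
```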

Evaluation (full rank Hilbert matrices)

We test the following methods:

  1. the Julia built-in QR decomposition qr()
  2. our QR decomposition with reorthogonalisation, using 10 and 1+$M_\varepsilon$ as thresholds for the criterion of Rutishauser that decides whether we should reorthogonalise (see the sketch below)
  3. our QR decomposition with reorthogonalisation and an R matrix update in each reorthogonalisation, using the same two thresholds

Updating R won’t have any impact on the orthogonality of $Q$, but it will influence the results of $QR-A$ and $R-Q^\top A$.
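As a rough illustration of this criterion, here is a minimal Gram-Schmidt step in Julia with Rutishauser-style reorthogonalisation and the optional R update. The function name and the maxpasses guard are our own additions for the sketch, not the benchmarked implementation.

```julia
using LinearAlgebra

# Sketch: orthogonalise column v against the columns of Q, repeating while
# Rutishauser's criterion fires. K = 10 is the default threshold;
# K = 1 + eps() forces a repeat on essentially any drop of the norm.
function orthogonalise!(v, r, Q; K = 10.0, maxpasses = 4)
    for _ in 1:maxpasses
        norm_before = norm(v)
        c = Q' * v          # projection coefficients onto the columns of Q
        v .-= Q * c         # one orthogonalisation pass
        r .+= c             # optional: fold the correction back into R
        # Stop once the norm no longer drops by a factor of at least K.
        K * norm(v) > norm_before && break
    end
    return v
end
```

With K = 1 + eps() the criterion fires on any drop of the norm, which corresponds to the "as many reorthogonalisations as possible" setting used below; dropping the `r .+= c` line gives the variant without the R update.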

Using 1+$M_\varepsilon$ as the threshold (i.e. enforcing as many reorthogonalisations as possible), we get the following results. Note that in each plot the y-axis is expressed in multiples of the machine epsilon $M_\varepsilon$.

Orthogonality

[Plot: Benchmark orthogonality]

Our implementation is slightly better than the built-in QR decomposition.

Accuracy in the reconstruction

For the accuracy of the reconstruction we obtain the following plot.

[Plot: Benchmark reconstruction]

Again we consistently outperform the built-in implementation by a tiny margin. Updating R yields a slight benefit in certain cases, but overall the effect is almost negligible. The reason might be that we rarely perform more than one and almost never more than two reorthogonalisations, so there is very little opportunity for the update to have an effect.

Accuracy in R

For the accuracy in R we obtain the following plot.

[Plot: Benchmark R]

Again the results are similar to the previous test. However, this time the effect of the R update is reversed.

If instead we use the default threshold value of 10, then we get the following graphs.

[Plot: Benchmark orthogonality]
[Plot: Benchmark reconstruction]
[Plot: Benchmark R]

Overall we lose a bit of accuracy. It’s interesting to note that in this case there is no longer any visible difference between updating R and not.

Comparison to other algorithms

If we compare our Gram-Schmidt-based implementation to QR decompositions based on Givens rotations or Householder transformations, we get the following results. Here we evaluate Hilbert matrices of sizes up to $768\times 768$ and perform a least-squares fit on the error to get an overview of the behaviour. In these experiments we use our implementation with the default threshold of 10 for the reorthogonalisation and with the $R$ matrix update turned on.
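For reference, here is a compact textbook-style Householder QR sketch in Julia; the function name is ours and this is only an illustration of the technique, not the implementation that was benchmarked (that code is linked in the notes below).

```julia
using LinearAlgebra

# Textbook Householder QR sketch: returns dense Q and R with A ≈ Q * R.
function householder_qr(A)
    m, n = size(A)
    R = float(Matrix(A))
    Q = Matrix{eltype(R)}(I, m, m)
    for k in 1:min(m - 1, n)
        x = R[k:m, k]
        nx = norm(x)
        nx == 0 && continue                # nothing to eliminate in this column
        v = copy(x)
        v[1] += (x[1] >= 0 ? nx : -nx)     # sign choice avoids cancellation
        β = 2 / dot(v, v)
        # Apply the reflector H = I - βvvᵀ to R from the left ...
        R[k:m, k:n] .-= β .* v * (v' * R[k:m, k:n])
        # ... and accumulate it into Q from the right.
        Q[:, k:m] .-= β .* (Q[:, k:m] * v) * v'
    end
    return Q, R
end
```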

[Plot: Other algorithms benchmark orthogonality]

As we can see, our implementation performs similarly to the Julia built-in QR decomposition and slightly better than the Givens rotations algorithm. The Householder algorithm deteriorates slowly with the size of the Hilbert matrix.

The quality of the reconstructions is given in the next plot.

[Plot: Other algorithms benchmark accuracy]

Interestingly, the reconstruction accuracy of the Householder algorithm is better than that of the Givens algorithm. The former performs similarly to our Gram-Schmidt algorithm and the built-in QR implementation.

The next two plots show how well $Q^\top A$ approximates $R$ and, similarly, how well $AR^{-1}$ (computed with the backslash operator) approximates $Q$.

[Plot: Other algorithms benchmark accuracy, $Q^\top A$ vs. $R$]

[Plot: Other algorithms benchmark accuracy, $AR^{-1}$ vs. $Q$]

Remarks

In terms of run time the Householder-based implementation performed similarly to the Gram-Schmidt-based code. Both were significantly faster than the Givens rotations. Indeed, the former two approaches work on a whole column of the matrix in each iteration, whereas the Givens rotations work element-wise. This also means that Givens rotations become a lot more interesting for sparse matrices.
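To illustrate the element-wise nature of the approach, here is a minimal Givens-based reduction in Julia using the givens and lmul! helpers from LinearAlgebra. The function name is ours, the rotations are not accumulated into Q, and a floating-point matrix is assumed.

```julia
using LinearAlgebra

# Sketch: reduce A in place to the upper triangular factor R by zeroing one
# subdiagonal entry per rotation. Each rotation only touches rows j and i,
# which is why structurally zero entries of a sparse matrix stay untouched.
# Example: givens_reduce!(rand(6, 4))
function givens_reduce!(A)
    m, n = size(A)
    for j in 1:n, i in m:-1:(j + 1)
        G, _ = givens(A[j, j], A[i, j], j, i)  # rotation in the (j, i) plane
        lmul!(G, A)                            # apply from the left
    end
    return A   # A is now R; Q would be the accumulated product of rotations
end
```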

Notes

All the code was evaluated with Julia version 1.5.2 (2020-09-23) using the official Julia Docker image. The code to create the plots and run the evaluations is available here. It is licensed under a 3-clause BSD licence. The code for the algorithms is available from GitLab. The zip file also includes a jld2 file with all the data necessary to recreate the plots.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License

Changelog

  • 2020/11/15: Added comparisons to implementations with Householder transformations and Givens rotations.
  • 2020/12/20: Updated the plots for the comparison with Householder transformations and Givens rotations with more data points. Made the code to evaluate the data available.