Condition Number and Least Squares
APAC-ANU Teaching Module
In this tutorial we introduce the ideas of conditioning and
stability and then talk about standard least squares algorithms such
as the normal equations, Gram-Schmidt and Householder. It might be
useful to check out these
notes as you attempt the
tutorial.
We use S
CILAB, a freely available scientific computing
environment like M
ATLAB. Check out our
Online
SCILAB Tutorials for information on using (and setting up)
S
CILAB.
You might find the
notes of use in tackling these
exercises.
Exercise: 1 (
Backward and Forward Error)
The first exercise is a pen and paper type question. The intention
is to clarify the difference between forward and backward errors and
to give an example of how the condition number connects the two
concepts. Once you have finished this question please wait before
moving onto the next question, we will discuss the answers in the
tutorial.
- The sine function is given by the infinite series
|
sin(x) = x − |
x3
3!
|
+ |
x5
5!
|
− |
x7
7!
|
+ … |
|
- What are the forward and backward errors if we approximate the
sine function by using only the first term in the series, i.e.
sin(x) ≈ x, for x = 0.1, 0.5 and 1.0? Answer this
question by filling in the following table:
| | x = 0.1 | x = 0.5 | x = 1.0 |
| |f(x) − [^f](x)|/|f(x)| | 1.6686 ×10−3 | | |
| | x−[^x]|/|x| | | | 5.7080 ×10−1 |
- What are the forward and backward errors if we approximate the sine function by using the first two terms in the series, i.e., sin(x) ≈ x − x3/6, for x = 0.1, 0.5 and 1.0?
Answer this question by filling in the following table:
| | x = 0.1 | x = 0.5 | x = 1.0 |
| |f(x) − [^f](x)|/|f(x)| | | 5.3996×10−4 | |
| | x−[^x]|/|x| | | 5.8992 ×10−4 | |
- Consider the problem of computing values for the sine function for arguments near 2π. Let x ≈ 2π and let h be a small perturbation to x.
- Find an approximation to the absolute and relative errors of sin(x) at x ≈ 2π using the following relations;
|
Absolute Error = |f(x+h)−f(x)| ≈ h|f′(x)| |
|
|
Relative Error = |
|f(x+h)−f(x)|
|f(x)|
|
≈ h |
|f′(x)|
|f(x)|
|
|
|
- What does this tell us about the condition number for sin(x) around x = 2π?
Exercise: 2 (
Subintervals)
The following question focuses on floating point arithmetic. The first part of the question is, once again, a pen and paper type question. You may want to refer to the floating point arithmetic model to justify your answer. The second part of the questions is a programming exercise. To save time, I have included some of the code in the notes; read through it and add any sections that are missing.
Suppose you need to generate n+1 equally spaced points on the interval [a, b], with spacing h = (b−a)/n.
- In floating-point arithmetic, which of the following methods,
|
x0 = a, xk = xk−1 + h, k = 1, …, n, |
|
or
|
xk = a + kh, k = 0, …, n, |
|
is better, and why?
- Write a program implementing both methods and find an example, say, with a = 0 and b = 1, that illustrates the difference between them. You may want to add to the following template (which is not complete):
/////////////////////////////////////////////////////////////////////
// specify the interval to be subdivided
/////////////////////////////////////////////////////////////////////
a = 0;
b = 1;
/////////////////////////////////////////////////////////////////////
// set the number of subintervals (note that you may want to start
// with a smaller value of n)
/////////////////////////////////////////////////////////////////////
n = 50000+1;
/////////////////////////////////////////////////////////////////////
// define the size of each subinterval
/////////////////////////////////////////////////////////////////////
h = (b-a)/n;
/////////////////////////////////////////////////////////////////////
// initialise storage for the subintervals based on the first
// algorithm given in the notes
/////////////////////////////////////////////////////////////////////
x = ones(n);
/////////////////////////////////////////////////////////////////////
// initialise storage for the subintervals based on the second
// algorithm given in the notes
/////////////////////////////////////////////////////////////////////
y = ones(n);
/////////////////////////////////////////////////////////////////////
// calculate the subintervals using the two different algorithms
/////////////////////////////////////////////////////////////////////
/////////////////////////////////////////////////////////////////////
// plot the difference between the two subintervals
/////////////////////////////////////////////////////////////////////
xbasc();
s = 100;
plot2d(1+(0:s:n), abs(y(1+(0:s:n))-x(1+(0:s:n))))
Exercise: 3 (
Cancellation Error)
This question is designed to explore the effects of cancellation error and to highlight the importance of careful programming. This is done by by looking at a very familiar equation, the quadratic formula. At first glance this may appear to be a trivial formula to implement, yet a careful implementation requires approximately two dozen lines of code.
- Write a function to solve the quadratic equation ax2+bx+c = 0 using the standard quadratic formula
Your function should accept values for the coefficients a, b and c as input and produce the two roots of the equation as output.
Recall that in SCILAB the general form of a function is:
function[x] = quadratic_root(a, b, c)
.
.
.
endfunction;
The variable x is a vector containing the two roots.
Test your program using the following values for the coefficients and print the results;
| a | b | c |
| 6 | 5 | -4 |
| 6 ×1030 | 5 ×1030 | −4 ×1030 |
| 0 | 1 | 1 |
| 1 | −106 | 1 |
| 1 | -4 | 3.999999 |
| 10−30 | −1030 | 1030 |
Below is a copy of my main file that tests the results. I saved my quadratic_root function in a file called quad.sci.
/////////////////////////////////////////////////////////////////////
// read in the routine to calculate the roots
/////////////////////////////////////////////////////////////////////
getf('quad.sci')
/////////////////////////////////////////////////////////////////////
// set the format of the output
/////////////////////////////////////////////////////////////////////
format('e',12);
/////////////////////////////////////////////////////////////////////
// run the first test problem
/////////////////////////////////////////////////////////////////////
a = 6
b = 5
c = -4
x = quadratic_root(a, b, c)
a*x(1)^2+b*x(1)+c
a*x(2)^2+b*x(2)+c
/////////////////////////////////////////////////////////////////////
// run the second test problem
/////////////////////////////////////////////////////////////////////
a = 6*10^{30}
b = 5*10^{30}
c = -4*10^{30}
x = quadratic_root(a, b, c)
a*x(1)^2+b*x(1)+c
a*x(2)^2+b*x(2)+c
/////////////////////////////////////////////////////////////////////
// run the third test problem
/////////////////////////////////////////////////////////////////////
a = 0
b = 1
c = 1
x = quadratic_root(a, b, c)
a*x(1)^2+b*x(1)+c
a*x(2)^2+b*x(2)+c
/////////////////////////////////////////////////////////////////////
// run the forth test problem
/////////////////////////////////////////////////////////////////////
a = 1
b = -10^5
c = 1
x = quadratic_root(a, b, c)
a*x(1)^2+b*x(1)+c
a*x(2)^2+b*x(2)+c
/////////////////////////////////////////////////////////////////////
// run the fith test problem
/////////////////////////////////////////////////////////////////////
a = 1
b = -4
c = 3.999999
x = quadratic_root(a, b, c)
a*x(1)^2+b*x(1)+c
a*x(2)^2+b*x(2)+c
/////////////////////////////////////////////////////////////////////
// run the sixth test problem
/////////////////////////////////////////////////////////////////////
a = 10^{-30}
b = -10^{30}
c = 10^{30}
x = quadratic_root(a, b, c)
a*x(1)^2+b*x(1)+c
a*x(2)^2+b*x(2)+c
- When testing your program you will most likely come across a problem with the second set of test data. This is a result of underflow and overflow. You should guard against unnecessary overflow and underflow by dividing both sides of the equation by the largest coefficient. Modify your root finding function to do this and check if the code now works on the test data.
- You will probably notice that the third and last test problem are still returning large errors. Try to make your program robust when given unusual input values. Any root that is within the range of the floating-point system should be computed accurately, even if the other is out of range. Your program should detect where the roots are imaginary, but need not use complex arithmetic explicitly. Once again, modify your root fining function and check how the program performs on the test data.
- You should now find that all of the test problems are giving results that are within order of the machine precision (given by %eps in SCILAB), except for the fourth test problem. This is an example of cancellation error. Recall that cancellation errors occur when you try to subtract two large numbers that are similar to one another. If b2 >> |4ac| then √{b2 − 4ac} ≈ |b| so for one choice of the sign, the standard quadratic equation suffers from bad cancellation errors. To avoid the problem, obtain the larger root, say x1, from
|
x1 = |
|
−(b + sign(b) | √
|
b2 − 4ac
|
) |
2a
|
if a ≠ 0. |
|
and x2 by
Finally, adjust your function to avoid cancellation errors and you should now find that your program works well on all of the test problems.
Exercise: 4 (
Gaussian Elimination)
The Hilbert matrix, defined below, is commonly used as a test matrix because it is poorly conditioned. The theme of this exercise is to use the Hilbert matrix to show that a small residual does not necessarily imply a small error.
An m ×m Hilbert matrix
H has entries h
ij = [1/(i+j−1)], so it is of the form
The following piece of code may be used to generate a Hilbert matrix of size m ×m. You will also need to generate the m-vector
b
=
Hx, where
x is the m-vector with all of its components equal to 1 (
x = ones(m, 1)).
/////////////////////////////////////////////////////////////////////
// define an m*m Hilbert matrix
/////////////////////////////////////////////////////////////////////
function[H] = Hilbert(m)
H = zeros(m, m);
for i = 1:m
H(i, :) = i:m-1+i
end
H = 1.0./H;
endfunction;
Use the SCILAB routines listed below to solve the resulting linear system
Hx
=
b obtaining an approximate solution [^(
x)].
/////////////////////////////////////////////////////////////////////
// Using a collection of elementary matrix operations convert the
// system Ax = b into Ux=b' where U is an upper triangular matrix
/////////////////////////////////////////////////////////////////////
function[U, be] = GE(A, b)
epsilon_tol = %eps*10;
// check that the matrix sizes make sense
[nr, nc] = size(A);
[br, bc] = size(b);
if nr ~= nc then
error('A must be square')
end;
if br ~= nr then
error ('Size of A and b are not consistent')
end;
// initialise the upper triangular matrix
U = A;
be = b;
// find the solution vector using the forward substitution algorithm
for j = 1:nc
if abs(U(j, j)) < epsilon_tol then
error('A has a zero pivot');
end;
for i = j+1:nc
m = U(i, j)/U(j, j)
U(i,j:nc) = U(i, j:nc)-m*U(j, j:nc)
be(i) = be(i)-m*be(j)
end;
end
endfunction;
/////////////////////////////////////////////////////////////////////
// Assuming that A is an upper triangular matrix, solve Ax = b using
// the back substitution algorithm
/////////////////////////////////////////////////////////////////////
function[x] = backsub(A, b)
epsilon_tol = %eps*10;
// check that the matrix sizes make sense
[nr, nc] = size(A);
[br, bc] = size(b);
if nr ~= nc then
error('A must be square')
end;
if br ~= nr then
error ('Size of A and b are not consistent')
end;
// initialise the solution vector
x = zeros(nc, 1);
// find the last element of the solution vector
if abs(A(nc, nc)) < epsilon_tol then
error('A is singular');
end;
x(nc) = b(nc)/A(nc, nc);
// find the solution vector using the back substitution algorithm
for i = nr-1:-1:1
if abs(A(i, i)) < epsilon_tol then
error('A is singular');
end;
x(i) = (b(i)-A(i, i+1:nc)*x(i+1:nc))/A(i, i)
end
endfunction;
Compute the ∞-norm of the residual
r
=
b−
Hx and of the error
e
=
x− [^(
x)], where
x is the vector of all ones. How large can you take m before the error is 100% ? Use
help norm to find out how to calculate the norm in SCILAB. The condition number of
the Hilbert matrix is κ(
Hm) ≈ exp(3.5m). As m varies how
does the number of correct digits in the computed solution relate to the
condition number of the matrix?
Exercise: 5 (
Least Squares)
The last exercise shows how the linear least squares method may be used to solve a real application.
Chemical engineers use tabulated pressure-volume-temperature (pvT) data to analyse the characteristics of various liquids such as ethyl alcohol, refrigerant, and water, which change phases between solid, liquid, and gas. Given such data Clapeyron's equation
may be used to determine the change in enthalpy of saturated water as it is vapourised to steam. We have assumed that the system consists of one kilogram of saturated water. Here ∆h is the change in enthalpy between the liquid and gas phase as the water vapourises. The parameter T denotes the temperature in degrees Kelvin and v
g and v
f are the specific volumes of the gas and liquid, respectively. We will take T = 373.15 K (100 degrees Celsius), v
g = 1.63m
3/kg, and v
f = 0.00104m
3/kg.
To estimate dp/dT we will firstly approximate the pressure p with the cubic polynomial [^p](T) = a + b T/100 + c (T/100)
2 + d(T/100)
3 by using the steam table given below
| Tp (oC) | p (bars) |
| 50 | 0.1235 |
| 60 | 0.1994 |
| 70 | 0.3119 |
| 80 | 0.4739 |
| 90 | 0.7014 |
| 100 | 1.014 |
| 110 | 1.433 |
| 120 | 1.958 |
| 130 | 2.701 |
| 140 | 3.613 |
| 150 | 4.759
|
The table may be coded up in SCILAB by using
/////////////////////////////////////////////////////////////////////
// define the steam table
/////////////////////////////////////////////////////////////////////
Tp = 50:10:150
p = [0.1235, 0.1994, 0.3119, 0.4739, 0.7014, 1.014, 1.433, 1.958, 2.701, 3.613, 4.759]'
- Use the least squares method to show [^p](T) = −1.065 + 4.744 T/100 + − 6.885(T/100)2 + 4.205 (T/100)3. The hard part about this question is setting up the A matrix correctly, ask for help if you are unsure. Once the A has been defined use the qr routine in SCILAB to find the least squares fit.
- Plot the data points and [^p](T).
- Using [^p](T), show that dp/dT at T = 100oC is approximately 3.588×10−2 bars/oC.
- As an alternative to using least squares interpolation may be used instead (i.e define the polynomial [^p](T) such that [^p](Tp) = p(Tp)). Try this by extending the code you wrote above.
- Data measurements usually contain some noise. Try adding a small amount of noise as follows:
/////////////////////////////////////////////////////////////////////
// define the steam table
/////////////////////////////////////////////////////////////////////
T = 50:10:150
p = [0.1235, 0.1994, 0.3119, 0.4739, 0.7014, 1.014, 1.433, 1.958, 2.701, 3.613, 4.759]'
p = p+ rand(p,'normal')*0.1;
Now compare the difference between the polynomial fit and the least squares fit. You should find that the least squares fit looks very similar to the one you obtained previously, on the other hand the polynomial fit will show some large oscillations, particularly near the end-points. This is another example of ill-conditioning; fitting large polynomials to data points is an ill-conditioned problem, hence we see that small changes in the input data result in large changes in the output data.