Skip Navigation | ANU Home | Search ANU
The Australian
National University
Computational Science Education Outreach and Training (EOT)
Printer Friendly Version of this Document
Solving Poisson's Equation using the CG Method
APAC-ANU Teaching Module



In this tutorial we demonstrate the usefulness of iterative solvers when trying to solve problems associated with partial differential equations. It might be useful to check out this notes as you attempt the tutorial.
We use SCILAB, a freely available scientific computing environment like MATLAB. Check out our Online SCILAB Tutorials for information on using (and setting up) SCILAB.
All of the questions below focus on the solution of Poisson's equation on the unit square with Dirichlet boundary conditions. In particular we shall look at −∆u = f where the exact solution is u = exp(x)exp(y). If the domain is subdivided into a uniform grid of size h = 1/(n+1) then the matrix resulting from the finite difference or finite element discretisation is as defined in the following piece of SCILAB code. Copy the code and save it in a file called matrix.sci.
function[A] = build_matrix(n)

        A = zeros(n*n, n*n)

        // diagonal

        A = A+diag(4*ones(diag(A)))

        // entry above diagonal

        d = -ones(diag(A, 1))
        d(n:n:n^2-n) = zeros( d(n:n:n^2-n))
        A = A+diag(d, 1)

        // entry below diagonal

        A = A+diag(d, -1)

        // second super and sub diagaonals

        A = A+diag(-ones(diag(A, n)), n)
        A = A+diag(-ones(diag(A, -n)), -n)

endfunction;


Exercise: 1 (Gaussian Elimination)
The main aim of this question is to convince you not to use Gaussian elimination to solve system of equations arising from the discretisation of partial differential equations. There are far better methods available and we will look at two of them in the following questions.
Below is a copy of the Gaussian Elimination routines given in the previous tutorial. You should copy the code into a file called GE.sci
/////////////////////////////////////////////////////////////////////
// Use a sequence of elementary matrix operations to convert the system
// Ax =b into Ux = b'
/////////////////////////////////////////////////////////////////////

function[U, be] = GE(A, b)

    epsilon_tol = %eps*10;

    // check that the matrix sizes make sense

    [nr, nc] = size(A);
    [br, bc] = size(b);

    if nr ~= nc then
      error('A must be square')
    end;
    if br ~= nr then
      error ('Size of A and b are not consistent')
    end;

    // initialise the upper triangular matrix

    U = A; be = b;

    // find the solution vector using the forward substitution algorithm

    for j = 1:nc
      if abs(U(j, j)) < epsilon_tol then
         error('A has a zero pivot');
      end;
      for i = j+1:nc
          m = U(i, j)/U(j, j)
          U(i,j:nc) = U(i, j:nc)-m*U(j, j:nc)
          be(i) = be(i)-m*be(j)
      end;
    end
endfunction;

/////////////////////////////////////////////////////////////////////
// Assuming that A is an upper triangular matrix, solve Ax = b using
// the back substitution algorithm
/////////////////////////////////////////////////////////////////////

function[x] = backsub(A, b)

    epsilon_tol = %eps*10;

    // check that the matrix sizes make sense

    [nr, nc] = size(A);
    [br, bc] = size(b);

    if nr ~= nc then
      error('A must be square')
    end;
    if br ~= nr then
      error ('Size of A and b are not consistent')
    end;

    // initialise the solution vector

    x = zeros(nc, 1);

    // find the last element of the solution vector

    if abs(A(nc, nc)) < epsilon_tol then
       error('A is singular');
    end;
    x(nc) = b(nc)/A(nc, nc);

    // find the solution vector using the back substitution algorithm

    for i = nr-1:-1:1
      if abs(A(i, i)) < epsilon_tol then
         error('A is singular');
      end;
      x(i) = (b(i)-A(i, i+1:nc)*x(i+1:nc))/A(i, i)
    end
endfunction;

Let A be the matrix corresponding to the finite element discretisation of Poisson's equation. Write a routine that will solve Ax = b, for a given n, using the Gaussian elimination routines listed above. The model problem had non-zero boundary conditions so we have to be careful about how they are included in the system of equations; below is one way of doing it:
        // exact solution

        x = matrix(exp((1:n)'/(n+1))*exp((1:n)/(n+1)), n*n, 1);

        // rhs

        h = 1.0/(n+1.0);
        b = -2*h^2*x;

        // adjust rhs for the boundary conditions

        b(1:n) = b(1:n)+exp((1:n)/(n+1))';
        b((n-1)*n+1:n*n) = b((n-1)*n+1:n*n)+exp(1.0)*exp((1:n)/(n+1))';
        b(1:n:(n-1)*n+1) = b(1:n:(n-1)*n+1)+exp((1:n)/(n+1))';
        b(n:n:n*n) = b(n:n:n*n)+exp(1.0)*exp((1:n)/(n+1))';

  • The first step is to make sure that the system of equations has been defined properly. Solve the system Ax = b for different size n (5 ≤ n ≤ 20), calculate the discretisation error and plot the results. Theoretically the error should be O(h2), check that is the case.
  • Gaussian elimination is an O(m3) algorithm, where A is an m ×m matrix (m = n2). This means that if you double the size of m your would expect the solution time to increase by a factor of 8. Using the SCILAB timer routine time how long it takes to solve the system Ax = b for different size n (5 ≤ n ≤ 20), plot the results and verify that the solution time is O(m3).
  • Another issue with Gaussian elimination is fill-in. The A matrix is sparse, has a lot of zeros, and thus does not require a lot of memory if special storage formats are used (which we will use in the next question). The Gaussian elimination procedure however fills in many of those zero entries with non-zero values and therefore increases the memory requirements. Repeat the above experiment but this time measure the number of non-zeros in the upper triangular matrix U relative to the number of non-zeros in A. The following code may be used to measure the number of non-zeros in a matrix:
    function[nz] = count_nonzero(A)
            B = ones(A)
            C = B(A <> 0.0)
            nz = sum(C)
    endfunction;
    
    
    There are direct methods available that try to deal with sparse matrices, however, in general it is much quicker and simpler to use iterative methods when dealing with matrices arising from the discretisation of partial differential equations.

Exercise: 2 (Conjugate Gradient)
We will now look an iterative scheme, namely the conjugate gradient method.
The first point to observe is that we can use sparse storage format to greatly reduce the amount of space we need. Replace the build_matrix function defined above with the following sparse version:
function[A] = build_matrix_sparse(n)

        // diagonal entry

        index = zeros(n*n, 2)
        index(:,1) = (1:n*n)'
        index(:,2) = index(:, 1)
        A = sparse(index, 4*ones(1:n*n), [n*n, n*n])

        // entry above diagonal

        index = zeros(n*n-1, 2)
        index(:,1) = (1:n*n-1)'
        index(:,2) = (2:n*n)'
        d = -ones(n*n-1, 1)
        d(n:n:n^2-n) = zeros( d(n:n:n^2-n))
        A = A+sparse(index, d, [n*n, n*n])

        // entry below diagonal

        index(:,1) = (2:n*n)'
        index(:,2) = (1:n*n-1)'
        A = A+sparse(index, d, [n*n, n*n])

        // second super diagonal

        index = zeros(n*(n-1), 2)
        index(:,1) = (1:n*(n-1))'
        index(:,2) = (n+1:n*n)'
        A = A+sparse(index, -ones(n*(n-1), 1), [n*n, n*n])

        // second sub diagonal

        index = zeros(n*(n-1), 2)
        index(:,1) = (n+1:n*n)'
        index(:,2) = (1:n*(n-1))'
        A = A+sparse(index, -ones(n*(n-1), 1), [n*n, n*n])

endfunction;

Sparse storage format is clearly more messy (less intuitive) than the standard format, which is one of the arguments against iterative methods, but the improved efficiency in time and memory makes up for it. Now we need a routine to implement the conjugate gradient method:

///////////////////////////////////////////////////////////////////////////
// Preconditioned Conjugate-Gradient solver
// Original Matlab code was written by C.T. Kelley and modified by
// Linda Stals for use with SCILAB
// Original documentaion is given below
//////////////////////////////////////////////////////////////////////////
// Preconditioned Conjugate-Gradient solver
//
// C. T. Kelley, December 12, 1993
//
// This code comes with no guarantee or warranty of any kind.
//
// function [x, error, total_iters, it_hist]
//                    = pcgsol(x0, b, atv, params, pcv)
//
//
// Input:        x0=initial iterate
//               b=right hand side
//               atv, a matrix-vector product routine
//          atv must return Ax when x is input
//          the format for atv is
//                       function ax = atv(x)
//               params = two dimensional vector to control iteration
//                        params(1) = relative residual reduction factor
//                        params(2) = max number of iterations
//               pcv, a routine to apply the preconditioner
//                       if omitted, the identity is used.
//                       The format for pcv is
//                       function px = pcv(x).
//
// Output:       x=solution
//               error = vector of iteration residual norms
//               total_iters = number of iterations
//               it_hist (optional) = matrix of all iterations
//          useful for movies
//
//
//
////////////////////////////////////////////////////////////////////////

function [x, resid, total_iters] = pcgsol(x0, b, A, params, pcv)

    // check that the matrix sizes make sense

    [nr, nc] = size(A);
    [br, bc] = size(b);
    [xr, xc] = size(x0);

    if nr ~= nc then
      error('A must be square')
    end;

    if br ~= nr then
      error ('Size of A and b are not consistent')
    end;

    if xr ~= nc then
      error ('Size of A and x are not consistent')
    end;

    // initialization

     errtol = params(1); maxiters = params(2);
     resid=0; x=x0;

     r=b - A*x;
     z = feval(r, pcv);

     rho=z'*r;

     tst=norm(r);
     terminate=errtol*norm(b);
     resid=[resid,tst];

     it=1;
     while((tst > terminate) & (it <= maxiters))

        if(it==1)
           p = z;
        else
           beta=rho/rhoold;
           p = z + beta*p;
        end;

        w = A*p;
        alpha=p'*w;

        // Test here to make sure the linear transformation is positive definite.
        // A non-positive value of alpha is a very bad sign.

        if(alpha <= 0)
            [alpha, rho, it]
            error(' negative curvature ')
        end
        alpha=rho/alpha;
        x=x+alpha*p;

        r = r - alpha*w;
        tst=norm(r);
        rhoold=rho;

        z = feval(r, pcv);

        rho=z'*r;
        it=it+1;
        resid=[resid,tst];
     end;

total_iters=it-1;
endfunction;

Save the code in a file called pcg.sci.
The initial guess is passed in as x0, b is the right hand side of the equation and A is the A matrix. The conjugate gradient method needs a stopping criteria, which is defined by params where params(1) is the tolerance ε and params(2) is the maximum number of iterations. The method iterates until either the maximum number of iterations has been reached or ||ri||2 ≤ ε||r0||2 (ri = bAxi). This stopping criteria is common, but it should be used with caution; as we saw before, a small residual does not necessarily imply a small error. It may be worthwhile researching this topic further to find out what is a good stopping criteria for your application area. The parameter pcv is used to define the preconditioner, we will not be using a preconditioner in this question.
In the output, x is the approximation to the solution, resid gives a list of the residuals ri and total_iters is the total number of iterations.
The following section of code gives an example of how to call the conjugate gradient routine if n = 10. The file matrix.sci contains the build_matrix_sparse routine. Run the code and have a look at the output.

/////////////////////////////////////////////////////////////////////
// get the user defined functions
/////////////////////////////////////////////////////////////////////

getf('matrix.sci')
getf('pcg.sci')

/////////////////////////////////////////////////////////////////////
// the preconditioner is just the identity matrix
/////////////////////////////////////////////////////////////////////

deff('[z]=precond(x)','z=x');

/////////////////////////////////////////////////////////////////////
// iterate until the residual has been reduced by a factor of 0.00001
// and allow a maximum of 100 iterations
/////////////////////////////////////////////////////////////////////

params = [0.00001, 100];


/////////////////////////////////////////////////////////////////////
// build the A matrix
/////////////////////////////////////////////////////////////////////

n = 10;
A = build_matrix_sparse(n);

/////////////////////////////////////////////////////////////////////
// define the system of equations
/////////////////////////////////////////////////////////////////////

x = ones(n*n,1);
b = A*x;
x0 = zeros(x);

/////////////////////////////////////////////////////////////////////
// solve the system
/////////////////////////////////////////////////////////////////////

[y, resid, total_iters] = pcgsol(x0, b, A, params, precond);

  • Modify the above code so it solves the model problem for 5 ≤ n ≤ 60 and plot the discretisation error. Are you getting the expected O(h2) convergence rate? If not, you may need to modify params to make ε smaller. For n = 60 the code is a bit slow so you may want to test/debug it first with a smaller upper bound.
  • Now time how long it takes to solve problems of size 5 ≤ n ≤ 60. You will notice that it is considerably faster than Gaussian elimination. The conjugate gradient routine uses a collection of matrix-vector multiplications and inner products, which are O(m) operations when taking the sparsity into account. Are the timing results O(m)? Keep a record of how long it took to solve the problem for n = 60 (this will be needed for Exercise 4).
  • Rerun the above experiment, but this time keep a record of the total number of iterations (total_iters) required to solve a problem of size n, 5 ≤ n ≤ 60, and plot the results. The number of iterations will increase as n is increased because the convergence rate of the conjugate gradient method depends on the condition number of A; and the condition number of A is O(1/h2). In Exercise 4 we look at using preconditioners to fix the problem.

Exercise: 3 (Multigrid)
A collection of free multigrid codes is available at the MGNET website http://www.mgnet.org.
The following is a multigrid code for Poisson's equation on a square domain. It uses the weighted Jacobi method as a smoother. Note that the code does not explicitly define the A matrix, rather it works on the finite difference grid. It assumes that the grid contains (2q+1) ×(2q+1) nodes (for some integer q) to make it easier to define the coarse grids. I saved the code in a file called mg.sci.

///////////////////////////////////////////////////////////////////////////
// Multigrid Poisson equation solver
// Original Matlab code was written by James Demmel and modified by
// Linda Stals, 2004, for use with SCILAB
// Original documentaion is given below
// http://www.cs.berkeley.edu/~demmel/ma221/Matlab/mgv.m
///////////////////////////////////////////////////////////////////////////
// Matlab code mgv.m
// For "Applied Numerical Linear Algebra",  Question 6.16
// Written by James Demmel, Jul 10, 1993
//                Modified, Jun  2, 1997
//
// Multigrid V-cycle for Poisson's equation on a square grid
// with zero Dirichlet boundary conditions.
// Algorithm from Brigg's ``Multigrid Tutorial''
// Include zero boundary values in arrays for ease of programming.
// Assume dimension n = 2^k + 1

// Inputs:
//   x = initial guess (n by n matrix with zeros on boundary)
//   b = right hand side (n by n matrix)
//   jac1, jac2 = number of weighted Jacobi steps to do before and
//                after recursive call to mgv
// Outputs:
//   z = improved solution (n by n matrix with zeros on boundary)
//
// function z=mgv(x,b,jac1,jac2)
//
////////////////////////////////////////////////////////////////////////

function z=mgv(x,b,jac1,jac2)

  // find size of matrix problem

  [n,m]=size(b);

  //  Solve 1 by 1 problem explicitly

  if n == 3 then
    z=zeros(3,3);
    z(2,2)=b(2,2)/4;

  // else apply multigrid algorithm

  else

    //  Perform jac1 steps of weighted Jacobi with weight w

    w=2/3;
    for i=1:jac1
      x_sum =  x(1:n-2,2:n-1) + x(3:n,2:n-1);
      x_sum = x_sum +  x(2:n-1,1:n-2) + x(2:n-1,3:n);
      x(2:n-1,2:n-1)=(1-w)*x(2:n-1,2:n-1) + (w/4)*(x_sum+b(2:n-1,2:n-1));
    end;

    //  Compute residual on current grid

    r=zeros(n,n);
    Ax = 4*x(2:n-1,2:n-1);
    Ax = Ax - x(1:n-2,2:n-1) - x(3:n,2:n-1) - x(2:n-1,1:n-2) - x(2:n-1,3:n);
    r(2:n-1,2:n-1) = b(2:n-1,2:n-1) -Ax;

    //  Restrict residual to coarse grid

    m=(n+1)/2;
    rhat=zeros(m,m);
    centre_pt = r(3:2:n-2,3:2:n-2);
    side_pt = r(2:2:n-3,3:2:n-2)+r(4:2:n-1,3:2:n-2)+r(3:2:n-2,2:2:n-3)+r(3:2:n-2,4:2:n-1);
    diag_pt = r(2:2:n-3,2:2:n-3)+r(4:2:n-1,2:2:n-3)+r(2:2:n-3,4:2:n-1)+r(4:2:n-1,4:2:n-1);

    rhat(2:m-1,2:m-1)=.25*centre_pt + .125*side_pt + 0.0625*diag_pt;

    //  Solve recursively with zero initial guess
    //  Multiply residual by 4 for coarser grid

    xhat = mgv(zeros(rhat),4*rhat,jac1,jac2);

    //  Interpolate coarse solution to fine grid, and add to x

    xcor=zeros(x);
    xcor(3:2:n-2,3:2:n-2)=xhat(2:m-1,2:m-1);
    xcor(2:2:n-1,3:2:n-2)=.5*(xhat(1:m-1,2:m-1)+xhat(2:m,2:m-1));
    xcor(3:2:n-2,2:2:n-1)=.5*(xhat(2:m-1,1:m-1)+xhat(2:m-1,2:m));
    xcor(2:2:n-1,2:2:n-1)=.25*(xhat(1:m-1,1:m-1)+xhat(1:m-1,2:m)+xhat(2:m,1:m-1)+xhat(2:m,2:m));

    z=x+xcor;

    //  Perform jac2 steps of weighted Jacobi with weight w

    for i=1:jac2
      z_sum =  z(1:n-2,2:n-1) + z(3:n,2:n-1);
      z_sum = z_sum +  z(2:n-1,1:n-2) + z(2:n-1,3:n);
      z(2:n-1,2:n-1)=(1-w)*z(2:n-1,2:n-1) + (w/4)*(z_sum+b(2:n-1,2:n-1));
    end;
  end;
endfunction;

The following example shows how to call the multigrid routine
/////////////////////////////////////////////////////////////////////
// get the user defined functions
/////////////////////////////////////////////////////////////////////

getf('mg.sci')


/////////////////////////////////////////////////////////////////////
// iterate until the residual has been reduced by a factor of 0.00001
// and allow a maximum of 100 iterations
/////////////////////////////////////////////////////////////////////

jac1 = 6;
jac2 = 6;
n_mg = 2;

/////////////////////////////////////////////////////////////////////
// define the system of equations
/////////////////////////////////////////////////////////////////////
p = 3;
n = 2^p-1;
y0 = zeros(n+2,n+2);
h = 1.0/(n+1);
x = sin(2*%pi*(0:n+1)'/(n+1))*sin(2*%pi*(0:n+1)/(n+1));
b = 8*%pi^2*h^2*x;

/////////////////////////////////////////////////////////////////////
// solve the system and keep track of the time (stored in t) and
// number of iterations required to solve the problem (stored in s)
/////////////////////////////////////////////////////////////////////
y = y0
for i = 1:n_mg
  y = mgv(y, b, jac1, jac2);
end;

  • Modify the above code so that it solves the model problem for varying size p (1 ≤ p ≤ 7).
  • Run the code and calculate the discretisation error for each value of p. Do you see the expected O(h2) discretisation error? You may need to adjust the n_mg parameter.
  • Rerun the code and time the results. Theoretically you should see that the timing is O(m); i.e.. it is optimal. In practise though the SCILAB timing routine is not particularly accurate and it is a little hard to verify the timing results. However it is clear that this method beats both Gaussian elimination and the conjugate gradient method.
Exercise: 4 (Preconditioned Conjugate Gradient)
In the final exercise we look at the use of the preconditioned conjugate gradient method.
  • Define a Jacobi preconditioner as follows
    /////////////////////////////////////////////////////////////////////
    // Define a Jacobi preconditioner. Note that we map the x vector onto
    // the 2D grid to define the Jacobi preconditioner. This means that we
    // define a mapping between two different sparse grid storage formats,
    // for larger applications we should try to avoid this
    /////////////////////////////////////////////////////////////////////
    function z=jacobi(b,it)
    
        // get the grid size
    
        [nr, nc] = size(b);
        m = sqrt(nr); n = m+2;
    
        // map onto the grid (including the boundary layer)
    
        f = zeros(m+2, m+2)
        f(2:m+1, 2:m+1) = matrix(b,m ,m)
        x = zeros(f);
    
        // apply the jacobi algorithm
    
        w = 2/3
        for i=1:it
          x_sum =  x(1:n-2,2:n-1) + x(3:n,2:n-1);
          x_sum = x_sum +  x(2:n-1,1:n-2) + x(2:n-1,3:n);
          x(2:n-1,2:n-1)=(1-w)*x(2:n-1,2:n-1) + (w/4)*(x_sum+f(2:n-1,2:n-1));
        end;
    
        // convert the storage back into the original format
    
        z = matrix(x(2:m+1, 2:m+1), nr, nc);
    endfunction;
    
    
    To pass the preconditioner into the pcgsol routine use
    deff('[z]=precond(x)','z=jacobi(x, 3)');
    
    
    Now rerun the experiment described in Exercise 2 and compare the total number of iterations to the case where there was no preconditioner. You should find that using the Jacobi preconditioner has roughly halved the total number of iterations for n = 60. Each iteration of the conjugate gradient method is now more expensive, but the reduction in the total number of iterations means that we get an overall reduction in time.
  • Write a preconditioner that uses the multigrid method and repeat the above experiment. This time you should see that the total number of conjugate gradient iterations does not increase as n is increased. This is the ideal case.
    For an example like Poisson's equation using the multigrid method as a preconditioner is over-kill, it would be much quicker just to call the multigrid method itself. However, for more complex problems you need to choose the correct interpolation and restriction operators to get the optimal performance out of multigrid, and this is not always easy to do. In such cases it is recommend to use a simple multigrid algorithm, similar to one described above, as a preconditioner in combination with a Krylov subspace method such as the conjugate gradient method.