### General linear methods for ordinary differential equations

John Butcher

The University of Auckland, New Zealand

Acknowledgements:

Will Wright, Zdzisław Jackiewicz, Helmut Podhaisky

Solving ordinary differential equations numerically is, even today, still a great challenge.

This applies especially to stiff differential equations and to closely related problems involving algebraic constraints (DAEs).

Although the problem seems to be solved — there are already highly efficient codes based on Runge–Kutta methods and linear multistep methods — questions concerning methods that lie between the traditional families are still open.

I will talk today about some aspects of these so-called “General linear methods”.

### Contents

- Introduction to stiff problems
- Why do we want a new numerical method?
- What are General Linear Methods?
- What we want from General Linear Methods
- Propagation of errors
- Algebraic analysis of order
- High order and stage order
- General Linear Methods and Runge–Kutta stability
- Doubly companion matrices
- Inherent Runge–Kutta stability
- Proof of stability result
- Example methods
- Implementation questions

### Introduction to stiff problems

We will write a standard initial value problem in the form
y^{′}(x) = f(x, y(x)), y(x_{0}) = y_{0}.

The variable x will be referred to as “time” even if it does not represent physical time.

Consider the problem

y_{1}^{′}(x) = y_{2}(x),
y_{2}^{′}(x) = −y_{1}(x),
y_{3}^{′}(x) = −Ly_{1}(x) + y_{2}(x) + Ly_{3}(x),

with initial values y_{1}(0) = 0, y_{2}(0) = 1, y_{3}(0) = ε.

If ε = 0, the solution is y(x) = (sin(x), cos(x), sin(x))^{T}.

Change the value of ε and the third component changes to sin(x) + ε exp(Lx).

We will see how this works for the values L = −25 and ε = 2.

The solution is plotted for the interval [0, 0.85].

For the approximate solution computed by the Euler method, we choose n = 16 steps and then decrease n to 10.
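The behaviour just described can be reproduced in a few lines. The following sketch (Python with NumPy; the function and variable names are my own, not from the talk) applies the Euler method to the problem above with n = 16 and with n = 10 steps:

```python
import numpy as np

L, eps = -25.0, 2.0

def f(x, y):
    # y1' = y2,  y2' = -y1,  y3' = -L*y1 + y2 + L*y3
    return np.array([y[1], -y[0], -L * y[0] + y[1] + L * y[2]])

def euler(n, x_end=0.85):
    # n steps of the explicit Euler method on [0, x_end]
    h = x_end / n
    x, y = 0.0, np.array([0.0, 1.0, eps])
    for _ in range(n):
        y = y + h * f(x, y)
        x += h
    return y

# The difference y3 - y1 is multiplied by (1 + hL) in each step,
# so it decays for n = 16 (|1 + hL| < 1) but grows for n = 10.
print(abs(euler(16)[2] - euler(16)[0]))   # small
print(abs(euler(10)[2] - euler(10)[0]))   # large
```

Running this shows the stiff component decaying harmlessly for n = 16 but being amplified step by step for n = 10, exactly as in the figures.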

[Figure: exact solution y_{1}, y_{2}, y_{3} on [0, 0.85] for L = −25]

[Figure: approximate (Euler) solution for L = −25 with n = 16 steps]

[Figure: approximate (Euler) solution for L = −25 with n = 10 steps]

Note that there seems to be no difficulty approximating
y_{1} and y_{2}, even for quite large stepsizes.

For stepsizes less than h = 0.08, the approximation to y_{3}
converges to the same value as y_{1}, but not in precisely the
same way as for the exact solution.

For stepsizes greater than h = 0.08 the approximations
to y_{3} are hopeless.

We can understand what happens better by studying the
difference y_{3} − y_{1}.

The value of y_{3} − y_{1} satisfies the equation
y^{′}(x) = Ly(x),

with solution const · exp(Lx).

The exact solution to y^{′} = Ly is multiplied by exp(z) (where z = hL) in each time step, but what about the computed solution?

According to the formula for the Euler method, y_{n} = y_{n−1} + hf(x_{n−1}, y_{n−1}), the numerical solution is instead multiplied by 1 + z in each time step.

The fact that 1 + z is a poor approximation to exp(z) when z is a large negative number lies at the heart of the difficulty in solving stiff problems.

Sometimes the “implicit Euler method” is used for the solution of stiff problems, because in this case the approximation exp(z) ≈ 1 + z is replaced by exp(z) ≈ (1 − z)^{−1}.
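The difference between the two schemes is just the growth factor applied per step. A quick check (plain Python; an illustration of mine, not from the talk) compares |1 + z| and |(1 − z)^{−1}| with exp(z) for the stepsizes used in these experiments:

```python
import math

L = -25.0
for n in (16, 10, 5):
    h = 0.85 / n
    z = h * L
    explicit = abs(1 + z)          # Euler growth factor per step
    implicit = abs(1 / (1 - z))    # implicit Euler growth factor per step
    print(f"n={n:2d}  z={z:7.3f}  |1+z|={explicit:.3f}  "
          f"|1/(1-z)|={implicit:.3f}  exp(z)={math.exp(z):.3f}")
```

For n = 10 and n = 5 we get |1 + z| > 1, so the explicit method amplifies the stiff component, while |(1 − z)^{−1}| < 1 for every z < 0.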

To see what happens in the case of the contrived problem we have considered, when Euler is replaced by implicit Euler, we present three further figures.

[Figure: solution computed by implicit Euler for L = −25 with n = 16 steps]

[Figure: solution computed by implicit Euler for L = −25 with n = 10 steps]

[Figure: solution computed by implicit Euler for L = −25 with n = 5 steps]

It looks as though, even for n as low as 5, we get completely acceptable performance.

The reason is that |(1 − z)^{−1}| is bounded by 1 for all z in the left half-plane.

This means that a purely linear problem with constant coefficients will have stable numerical behaviour whenever the same is true for the exact solution.

Methods with this property are said to be “A-stable”.

Although by no means a guarantee that it will solve all stiff problems adequately, A-stability is an important attribute and a useful certification.

### Why do we want a new numerical method?

- We want methods that are efficient for stiff problems
  - Hence we cannot use highly implicit Runge–Kutta methods
  - Hence we want to avoid non-A-stable BDF methods
- We want methods that have high stage order
  - Otherwise order reduction can occur for stiff problems
  - Otherwise it is difficult to estimate accuracy
  - Otherwise it is difficult to interpolate for dense output
- We want methods for which asymptotically correct error estimates can be found
  - This means having accurate information available in the step to make this possible
  - High stage order makes this easier
- We want methods for which changes to adjacent orders can be considered and evaluated
  - This means being able to evaluate, for example, h^{p+2}y^{(p+2)} as well as h^{p+1}y^{(p+1)} (p denotes the order of the method)
  - Evaluating the error for a lower order alternative method is usually easy

### What are General Linear Methods?

General linear methods are the natural generalization of Runge–Kutta (multi-stage) methods and linear multistep (multivalue) methods.

This can be illustrated in a diagram.

[Diagram: General Linear Methods contain both the Runge–Kutta and Linear Multistep families, which intersect in the Euler method]

We will consider r-value, s-stage methods, where r = 1 for a Runge–Kutta method and s = 1 for a linear multistep method.

Each step of the computation takes as input a certain

number (r) of items of data and generates for output the same number of items.

The output items correspond to the input items but advanced through one time step (h).

Within the step a certain number (s) of stages of computation are performed.

Denote by p the order of the method and by q the

“stage-order”.

At the start of step number n, denote the input items by y_{i}^{[n−1]}, i = 1, 2, . . . , r.

Denote the stages computed in the step and the stage derivatives by Y_{i} and F_{i} respectively, i = 1, 2, . . . , s. For a compact notation, write

y^{[n−1]} = [y_{1}^{[n−1]}, y_{2}^{[n−1]}, . . . , y_{r}^{[n−1]}]^{T},
y^{[n]} = [y_{1}^{[n]}, y_{2}^{[n]}, . . . , y_{r}^{[n]}]^{T},
Y = [Y_{1}, Y_{2}, . . . , Y_{s}]^{T},
F = [F_{1}, F_{2}, . . . , F_{s}]^{T}.

The stages are computed by the formulae

Y_{i} = Σ_{j=1}^{s} a_{ij}hF_{j} + Σ_{j=1}^{r} u_{ij}y_{j}^{[n−1]}, i = 1, 2, . . . , s,

and the output approximations by the formulae

y_{i}^{[n]} = Σ_{j=1}^{s} b_{ij}hF_{j} + Σ_{j=1}^{r} v_{ij}y_{j}^{[n−1]}, i = 1, 2, . . . , r.

These can be written together using a partitioned matrix:

[   Y      ]   [ A  U ] [   hF       ]
[ y^{[n]}  ] = [ B  V ] [ y^{[n−1]}  ]
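As a sketch of how the partitioned tableau is used in practice, the following function (an illustration of mine, not code from the talk) performs one step of an *explicit* GLM, i.e. one with strictly lower triangular A, so that the stages can be evaluated in sequence:

```python
import numpy as np

def glm_step(A, U, B, V, c, f, x, h, y_in):
    """One step of an explicit general linear method.

    y_in is an r-by-N array of input quantities; returns the
    r-by-N array of output quantities for the next step.
    Assumes A is strictly lower triangular (explicit stages).
    """
    s, N = A.shape[0], y_in.shape[1]
    Y = np.zeros((s, N))   # stage values
    F = np.zeros((s, N))   # stage derivatives
    for i in range(s):
        Y[i] = h * (A[i, :i] @ F[:i]) + U[i] @ y_in
        F[i] = f(x + c[i] * h, Y[i])
    return h * (B @ F) + V @ y_in

# Sanity check: with r = s = 1 and A = 0, U = B = V = 1,
# the method reduces to the Euler method.
A = np.array([[0.0]]); U = np.array([[1.0]])
B = np.array([[1.0]]); V = np.array([[1.0]])
out = glm_step(A, U, B, V, [0.0], lambda x, y: y, 0.0, 0.1, np.array([[1.0]]))
print(out)   # one Euler step for y' = y, y(0) = 1, h = 0.1
```

Implicit methods would instead require solving the stage equations, e.g. by a modified Newton iteration; this sketch only covers the explicit case.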

### What we want from General Linear Methods

- Convergence (consistency and stability)
- Local accuracy
- Global accuracy (understand propagation of errors)
- A-stability for stiff problems
- RK stability (behaves like a Runge–Kutta method)

### Propagation of errors

Let S denote a “starting method”, that is, a mapping from R^{N} to R^{rN}, and let F : R^{rN} → R^{N} be a corresponding finishing method such that F ◦ S = id.

The order of accuracy of a multivalue method is defined in terms of a commutative diagram: one step of the method M, applied after the starting method S, agrees with S applied after the exact solution operator E, to within O(h^{p+1}) (h = stepsize).

[Diagram: M ◦ S = S ◦ E + O(h^{p+1})]

By duplicating this diagram over many steps, global error estimates may be found.

[Diagram: n steps of M, compared via S and the finishing method F with n applications of E; the accumulated error is O(h^{p})]

### Algebraic analysis of order

We recall the way order is defined for Runge-Kutta methods.

We consider the solution of an autonomous differential
equation y^{′}(x) = f(y(x)).

Let T denote the set of rooted trees. To each tree t is associated an “elementary differential” F(t), defined in terms of the function f. Also associated with each tree are its symmetry σ(t) and its density γ(t). In the following table, r(t) is the “order” (number of vertices) of t.

| r(t) | σ(t) | γ(t) | F(t) | F(t)^{i} |
|---|---|---|---|---|
| 1 | 1 | 1 | f | f^{i} |
| 2 | 1 | 2 | f^{′}f | f_{j}^{i}f^{j} |
| 3 | 2 | 3 | f^{′′}(f, f) | f_{jk}^{i}f^{j}f^{k} |
| 3 | 1 | 6 | f^{′}f^{′}f | f_{j}^{i}f_{k}^{j}f^{k} |
| 4 | 6 | 4 | f^{′′′}(f, f, f) | f_{jkl}^{i}f^{j}f^{k}f^{l} |
| 4 | 1 | 8 | f^{′′}(f, f^{′}f) | f_{jk}^{i}f^{j}f_{l}^{k}f^{l} |
| 4 | 2 | 12 | f^{′}f^{′′}(f, f) | f_{j}^{i}f_{kl}^{j}f^{k}f^{l} |
| 4 | 1 | 24 | f^{′}f^{′}f^{′}f | f_{j}^{i}f_{k}^{j}f_{l}^{k}f^{l} |
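The symmetry and density functions have simple recursive definitions: for a tree t with subtrees t_{1}, . . . , t_{m}, the density is γ(t) = r(t)·γ(t_{1})⋯γ(t_{m}), and the symmetry is the product over distinct subtrees s of k!·σ(s)^{k}, where k is the multiplicity of s. A small sketch (trees encoded as nested tuples, an encoding of my own choosing) reproduces the table:

```python
from collections import Counter
from math import factorial

# A rooted tree is a tuple of its subtrees; () is the one-vertex tree.
def order(t):
    return 1 + sum(order(s) for s in t)

def gamma(t):
    # density: order of t times the densities of its subtrees
    g = order(t)
    for s in t:
        g *= gamma(s)
    return g

def sigma(t):
    # symmetry: product of k! * sigma(s)^k over distinct subtrees s
    # appearing with multiplicity k
    result = 1
    for s, k in Counter(t).items():
        result *= factorial(k) * sigma(s) ** k
    return result

# The "bushy" order-3 tree f''(f, f) and the "tall" tree f'f'f:
print(sigma(((), ())), gamma(((), ())))   # 2 3
print(sigma((((),),)), gamma((((),),)))   # 1 6
```

The six-vertex tree f^{′′′}(f, f, f^{′′}(f, f)) examined below is encoded as `((), (), ((), ()))` and gives σ = 4, γ = 18, matching the example.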

We will look at a single example tree in more detail: the tree t with

r(t) = 6, σ(t) = 4, γ(t) = 18,

and elementary differential

F(t) = f^{′′′}(f, f, f^{′′}(f, f)).

(In the original slide the vertices of t are labelled 6, 1, 1, 3, 1, 1, and γ(t) = 18 is their product.)

The solution to the standard autonomous initial-value problem

y^{′}(x) = f(y(x)), y(x_{0}) = y_{0},

has the formal Taylor series

y(x_{0} + h) = y_{0} + Σ_{t∈T} (h^{r(t)} / (σ(t)γ(t))) F(t)(y_{0}).

A Runge–Kutta method has the same series, but with 1/γ(t) replaced by a sequence of “elementary weights” characteristic of the specific Runge–Kutta method.

Expressions of this type are usually referred to as B-series.

Let G denote the set of mappings T^{#} → R, where T^{#} = {∅} ∪ T (and ∅ is the empty tree).

Also write G_{0} and G_{1} for the subsets for which ∅ ↦ 0 or ∅ ↦ 1, respectively.

A mapping G_{1} × G → G can be defined which represents compositions of various operations. In particular, if this is restricted to G_{1} × G_{1} → G_{1}, it becomes a group operation.

Let E denote the mapping t ↦ 1/γ(t), and let D be the mapping which takes every tree to zero except the unique order-1 tree, which maps to one.

A general linear method defined from the matrices [A, U, B, V] has order p if there exist ξ ∈ G^{r} and η ∈ G_{1}^{s} such that

η = AηD + Uξ,
Eξ = BηD + Vξ, (*)

where pre-multiplication by E and post-multiplication by D are scalar times vector products and (*) is to hold only up to trees of order p.

### High order and stage order

If we consider only methods with q = p, then simpler criteria can be found.

Let exp(cz) denote the component-by-component exponential. Then order p and stage-order q ≥ p − 1 is equivalent to

exp(cz) = zA exp(cz) + Uw(z) + O(z^{q+1}),
exp(z)w(z) = zB exp(cz) + V w(z) + O(z^{p+1}),

where the power series

w(z) = w_{0} + w_{1}z + · · · + w_{p}z^{p}

represents an input approximation

y^{[0]} = Σ_{i=0}^{p} w_{i}h^{i}y^{(i)}(x_{0}).

### GL Methods and RK stability

A General Linear Method is “Runge–Kutta stable” if its stability matrix

M(z) = V + zB(I − zA)^{−1}U
has only one non-zero eigenvalue.

Armed with this property, we should expect attraction to the manifold S(R^{N}), and stable adherence to this manifold.

This means that the method starts acting like a one-step method.
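RK stability can be checked numerically by forming M(z) and inspecting its eigenvalues. A minimal sketch (NumPy; for illustration, the method used is the implicit Euler method written as a trivial GLM with r = s = 1, not one of the methods discussed here):

```python
import numpy as np

def stability_matrix(A, U, B, V, z):
    # M(z) = V + z B (I - z A)^{-1} U
    I = np.eye(A.shape[0])
    return V + z * B @ np.linalg.solve(I - z * A, U)

# Implicit Euler as a GLM: A = U = B = V = [1].
A = U = B = V = np.array([[1.0]])
M = stability_matrix(A, U, B, V, -1.0)
print(np.linalg.eigvals(M))   # single eigenvalue R(-1) = 1/(1-(-1)) = 0.5
```

For an RK-stable method with r > 1, the same computation would show one eigenvalue equal to R(z) and the rest equal to zero.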

### Doubly companion matrices

Matrices like the following are “companion matrices” for the polynomials

z^{n} + α_{1}z^{n−1} + · · · + α_{n} and z^{n} + β_{1}z^{n−1} + · · · + β_{n},

respectively:

[ −α_{1} −α_{2} −α_{3} · · · −α_{n−1} −α_{n} ]
[   1      0      0    · · ·    0       0    ]
[   0      1      0    · · ·    0       0    ]
[  ...    ...    ...            ...     ...  ]
[   0      0      0    · · ·    0       0    ]
[   0      0      0    · · ·    1       0    ]

and

[ 0 0 0 · · · 0 −β_{n}   ]
[ 1 0 0 · · · 0 −β_{n−1} ]
[ 0 1 0 · · · 0 −β_{n−2} ]
[ ... ... ...    ...     ]
[ 0 0 0 · · · 0 −β_{2}   ]
[ 0 0 0 · · · 1 −β_{1}   ]

Their characteristic polynomials can be found from det(I − zC) = α(z) or β(z) respectively, where C denotes the corresponding matrix and

α(z) = 1 + α_{1}z + · · · + α_{n}z^{n}, β(z) = 1 + β_{1}z + · · · + β_{n}z^{n}.
A matrix with both α and β terms,

X =
[ −α_{1} −α_{2} −α_{3} · · · −α_{n−1} −α_{n} − β_{n} ]
[   1      0      0    · · ·    0      −β_{n−1}       ]
[   0      1      0    · · ·    0      −β_{n−2}       ]
[  ...    ...    ...            ...      ...          ]
[   0      0      0    · · ·    0      −β_{2}         ]
[   0      0      0    · · ·    1      −β_{1}         ]

is known as a “doubly companion matrix” and has characteristic polynomial defined by

det(I − zX) = α(z)β(z) + O(z^{n+1})
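This identity is easy to verify numerically. The following sketch (NumPy, with arbitrarily chosen coefficients of my own) builds a doubly companion matrix for n = 3 and checks that the coefficients of det(I − zX) up to z^{n} agree with the product α(z)β(z):

```python
import numpy as np

n = 3
alpha = [1.0, 2.0, 3.0]   # alpha_1..alpha_n (arbitrary test values)
beta = [4.0, 5.0, 6.0]    # beta_1..beta_n

# Doubly companion matrix: alphas across the top row, betas down the
# last column, subdiagonal of ones, corner entry -alpha_n - beta_n.
X = np.zeros((n, n))
X[0, :] = [-a for a in alpha]
X[0, -1] -= beta[-1]
for i in range(1, n):
    X[i, i - 1] = 1.0
    X[i, -1] += -beta[n - 1 - i]

# If char(X) = lambda^n + c1 lambda^{n-1} + ... + cn, then
# det(I - zX) = 1 + c1 z + ... + cn z^n.
c = np.poly(X)                                        # [1, c1, c2, c3]
prod = np.convolve([1.0] + alpha, [1.0] + beta)[: n + 1]
print(np.allclose(c, prod))   # True
```

So det(I − zX) agrees with α(z)β(z) up to and including the z^{n} term, as claimed.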

Matrices Ψ^{−1} and Ψ transforming X to Jordan canonical form are known.

In the special case of a single Jordan block with n-fold eigenvalue λ, we have

Ψ^{−1} =
[ 1  λ + α_{1}  λ^{2} + α_{1}λ + α_{2}  · · · ]
[ 0  1          2λ + α_{1}              · · · ]
[ 0  0          1                       · · · ]
[ ...                                    . . . ]

where row number i + 1 is formed from row number i by differentiating with respect to λ and dividing by i.

We have a similar expression for Ψ:

Ψ =
[ . . .                                        ...  ]
[ · · ·  1  2λ + β_{1}  λ^{2} + β_{1}λ + β_{2}      ]
[ · · ·  0  1           λ + β_{1}                   ]
[ · · ·  0  0           1                           ]

The Jordan form is Ψ^{−1}XΨ = J + λI, where J_{ij} = δ_{i,j+1}. That is,

Ψ^{−1}XΨ =
[ λ  0  · · ·  0  0 ]
[ 1  λ  · · ·  0  0 ]
[ ...          ...  ]
[ 0  0  · · ·  λ  0 ]
[ 0  0  · · ·  1  λ ]

### Inherent Runge–Kutta stability

Using doubly companion matrices, it is possible to construct GL methods possessing RK stability using only rational operations. The methods constructed in this way are said to possess “Inherent Runge–Kutta Stability”.

Apart from exceptional cases (in which certain matrices are singular), we characterize a method with r = s = p + 1 = q + 1 by the parameters:

- λ, the single eigenvalue of A
- c_{1}, c_{2}, . . . , c_{s}, the stage abscissae
- the error constant
- β_{1}, β_{2}, . . . , β_{p}, elements in the last column of the s × s doubly companion matrix X
- information on the structure of V

Consider only methods for which the outputs of step n approximate the “Nordsieck vector”:

[ y_{1}^{[n]}   ]   [ y(x_{n})             ]
[ y_{2}^{[n]}   ]   [ hy^{′}(x_{n})        ]
[ y_{3}^{[n]}   ] ≈ [ h^{2}y^{′′}(x_{n})   ]
[   ...         ]   [   ...                ]
[ y_{p+1}^{[n]} ]   [ h^{p}y^{(p)}(x_{n})  ]

For such methods, V has the form

V = [ 1  v^{T} ]
    [ 0  V̇    ]

Such a method has the IRKS property if a doubly companion matrix X exists so that, for some vector ξ,

BA = XB, BU = XV − V X + e_{1}ξ^{T}, ρ(V̇) = 0.

It will be shown that, for such methods, the stability matrix satisfies

M(z) ∼ V + ze_{1}ξ^{T}(I − zX)^{−1},

which has all except one of its eigenvalues zero. The non-zero eigenvalue has the role of stability function

R(z) = N(z) / (1 − λz)^{s}

### Proof of stability result

(I − zX)M(z)(I − zX)^{−1}
= (V − zXV + z(B − zXB)(I − zA)^{−1}U)(I − zX)^{−1}
= (V − zXV + zBU)(I − zX)^{−1}   (using XB = BA, so that B − zBA = B(I − zA))
= V + ze_{1}ξ^{T}(I − zX)^{−1}   (using BU = XV − V X + e_{1}ξ^{T})

This has a single non-zero eigenvalue, equal to

R(z) = 1 + zξ^{T}(I − zX)^{−1}e_{1} = det(I + z(e_{1}ξ^{T} − X)) / det(I − zX)
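The last equality is an instance of the matrix determinant lemma, det(M + uv^{T}) = det(M)(1 + v^{T}M^{−1}u), and can be spot-checked numerically (NumPy, random data of my own choosing):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
X = rng.standard_normal((n, n))
xi = rng.standard_normal(n)
e1 = np.zeros(n); e1[0] = 1.0
z = 0.3   # small enough that I - zX is comfortably invertible

# R(z) written two ways, as in the proof above:
lhs = 1 + z * xi @ np.linalg.solve(np.eye(n) - z * X, e1)
rhs = (np.linalg.det(np.eye(n) + z * (np.outer(e1, xi) - X))
       / np.linalg.det(np.eye(n) - z * X))
print(np.isclose(lhs, rhs))   # True
```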

### Example methods

The following third order method is explicit and suitable for the solution of non-stiff problems (the first four rows give [A U], the last four give [B V]):

[       0              0       0     0  |  1        1/4             1/32            1/384         ]
[   −176/1885          0       0     0  |  1     2237/3770       2237/15080      2149/90480       ]
[ −335624/311025     29/55     0     0  |  1  1619591/1244100  260027/904800  1517801/39811200    ]
[  −67843/6435      395/33    −5     0  |  1    29428/6435        527/585       41819/102960      ]
[  −67843/6435      395/33    −5     0  |  1    29428/6435        527/585       41819/102960      ]
[       0              0       0     1  |  0        0               0               0             ]
[     82/33         −274/11  170/9  −4/3 | 0      482/99            0            −161/264         ]
[      −8            −12     40/3   −2  |  0      26/3              0               0             ]

The following fourth order method is implicit, L-stable, and suitable for the solution of stiff problems (the first five rows give [A U], the last five give [B V]):

[      1/4                 0          0     0    0  |  1        3/4               1/2              1/4               0          ]
[  −513/54272             1/4         0     0    0  |  1    27649/54272       5601/27136       1539/54272       −459/6784       ]
[ 3706119/69088256     −488/3819     1/4    0    0  |  1  15366379/207264768 756057/34544128  1620299/69088256 −4854/454528    ]
[ 32161061/197549232  −111814/232959 134/183 1/4 0  |  1 −32609017/197549232 929753/32924872  4008881/32924872 174981/3465776  ]
[ −135425/2948496      −641/10431    73/183 1/2 1/4 |  1  −367313/8845488   −22727/1474248    40979/982832      323/25864      ]
[ −135425/2948496      −641/10431    73/183 1/2 1/4 |  1  −367313/8845488   −22727/1474248    40979/982832      323/25864      ]
[       0                  0          0     0    1  |  0        0                0                0                0           ]
[  2255/2318          −47125/20862  447/122 −11/4 4/3 | 0  −28745/20862      −1937/13908       351/18544         65/976        ]
[ 12620/10431          −96388/31293 3364/549 −10/3 4/3 | 0 −70634/31293      −2050/10431      −187/2318         113/366        ]
[   414/1159           −29954/31293 130/61  −1   1/3 |  0 −27052/31293       −113/10431       −491/4636         161/732        ]

### Implementation questions

Many implementation questions are similar to those for traditional methods, but there are some new challenges.

We want variable order and stepsize, and it is even a realistic aim to change between stiff and non-stiff methods automatically.

Because of the variable order and stepsize aims, we wish to be able to do the following:

- Estimate the local truncation error of the current step
- Estimate the local truncation error of an alternative method of higher order
- Change the stepsize with little cost and with little impact on stability
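For Nordsieck methods, changing the stepsize is particularly cheap: since the input quantities approximate h^{i}y^{(i)}(x_{n}), changing h to rh only requires scaling component i by r^{i}. A sketch (a helper of my own, assuming the Nordsieck representation introduced earlier):

```python
import numpy as np

def rescale(nordsieck, r):
    """Rescale a Nordsieck vector [y, h y', ..., h^p y^(p)]
    for a stepsize change h -> r*h."""
    p = len(nordsieck) - 1
    return nordsieck * r ** np.arange(p + 1)

# Halving the stepsize divides component i by 2^i:
v = np.array([1.0, 2.0, 4.0, 8.0])
print(rescale(v, 0.5))   # [1. 1. 1. 1.]
```

A production code would, in addition, guard against repeated large ratio changes, which is part of the stability impact mentioned above.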

My final comment is that all these things are possible and many of the details are already known and understood.

What is not known is how to choose between different choices of the free parameters.

Identifying the best methods is the first step in the construction of a competitive stiff and non-stiff solver.

### References

The following recent publications contain references to earlier work:

- *Numerical Methods for Ordinary Differential Equations*, J. C. Butcher, J. Wiley, Chichester (2003).
- *General linear methods*, J. C. Butcher, Acta Numerica 15 (2006), 157–256.

Thank you.