General linear methods
for ordinary differential equations
John Butcher
The University of Auckland, New Zealand
Acknowledgements:
Will Wright, Zdzisław Jackiewicz, Helmut Podhaisky
Solving ordinary differential equations numerically is, even today, a great challenge.
This applies especially to stiff differential equations and to closely related problems involving algebraic
constraints (DAEs).
Although the problem seems to be solved — there are already highly efficient codes based on Runge–Kutta methods and linear multistep methods — questions concerning methods that lie between the traditional families are still open.
I will talk today about some aspects of these so-called
“General linear methods”.
Contents
Introduction to stiff problems
Why do we want a new numerical method?
What are General Linear Methods?
What we want from General Linear Methods
Propagation of errors
Algebraic analysis of order
High order and stage order
General Linear Methods and Runge–Kutta stability
Doubly companion matrices
Inherent Runge–Kutta stability
Proof of stability result
Example methods
Implementation questions
Introduction to stiff problems
We will write a standard initial value problem in the form y′(x) = f(x, y(x)), y(x0) = y0.
The variable x will be referred to as “time” even if it does not represent physical time.
Consider the problem

y1′(x) = y2(x),
y2′(x) = −y1(x),
y3′(x) = −L y1(x) + y2(x) + L y3(x),

with initial values y1(0) = 0, y2(0) = 1, y3(0) = ε.
If ε = 0, the solution is

y(x) = (sin(x), cos(x), sin(x))^T.

Change the value of ε and the third component changes to sin(x) + ε exp(Lx).
We will see how this works for the values L = −25 and ε = 2.
The solution is plotted for the interval [0, 0.85].
For the approximate solution computed by the Euler
method, we choose n = 16 steps and then decrease n to 10.
[Figure: Exact solution for L = −25; components y1, y2, y3 plotted against x on [0, 0.85]]
[Figure: Approximate solution for L = −25 with n = 16 steps; components y1, y2, y3 plotted against x]
[Figure: Approximate solution for L = −25 with n = 10 steps; components y1, y2, y3 plotted against x]
Note that there seems to be no difficulty approximating y1 and y2, even for quite large stepsizes.
For stepsizes less than h = 0.08, the approximation to y3 converges towards y1, but not in precisely the same way as the exact solution does.
For stepsizes greater than h = 0.08 the approximations to y3 are hopeless.
We can understand what happens better by studying the difference y3 − y1.
The value of y3 − y1 satisfies the equation y′(x) = Ly(x),
with solution const · exp(Lx).
The exact solution to y′ = Ly is multiplied by exp(z) (where z = hL) in each time step but what about the computed solution?
According to the formula for the Euler method,

yn = yn−1 + h f(xn−1, yn−1),

the numerical solution is multiplied instead by 1 + z in each time step.
The fact that 1 + z is a poor approximation to exp(z) when z is a large negative number is at the heart of the difficulty in solving stiff problems.
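This growth factor can be sketched numerically, using the interval [0, 0.85], the value L = −25, and the two step counts from the example:

```python
import math

# Each Euler step for y' = L*y multiplies the solution by (1 + h*L);
# the exact solution is multiplied by exp(h*L).
L = -25.0

for n in (16, 10):            # the step counts used in the figures above
    h = 0.85 / n              # stepsize on the interval [0, 0.85]
    z = h * L
    print(f"n={n:2d}  h={h:.4f}  1+z={1 + z:+.4f}  exp(z)={math.exp(z):.4f}")
```

For n = 16 the factor is −0.328125, so errors in y3 − y1 decay (with alternating sign); for n = 10 it is −1.125, so they grow in magnitude every step. The threshold |1 + hL| = 1 is reached exactly at h = 0.08.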
Sometimes the “implicit Euler method” is used for the solution of stiff problems because in this case the approximation exp(z) ≈ 1 + z is replaced by exp(z) ≈ (1 − z)^{-1}.
To see what happens in the case of the contrived problem we have considered, when Euler is replaced by implicit Euler, we present three further figures.
[Figure: Solution computed by implicit Euler for L = −25 with n = 16 steps]

[Figure: Solution computed by implicit Euler for L = −25 with n = 10 steps]

[Figure: Solution computed by implicit Euler for L = −25 with n = 5 steps]
It looks as though, even for n as low as 5, we get completely acceptable performance.
The key to the reason for this is that |(1 − z)^{-1}| is bounded by 1 for all z in the left half-plane.
This means that a purely linear problem, with constant coefficients, will have stable numerical behaviour
whenever the same is true for the exact solution.
Methods with this property are said to be “A-stable”.
Although by no means a guarantee that it will solve all stiff problems adequately, A-stability is an important attribute and a useful certification.
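A minimal sketch of this comparison, applied to the equation y′ = Ly satisfied by the difference y3 − y1, with the values L = −25, ε = 2 and the interval [0, 0.85] from the example:

```python
L, eps, T = -25.0, 2.0, 0.85

def explicit_euler(n):
    h, y = T / n, eps
    for _ in range(n):
        y = y + h * L * y          # multiply by (1 + h*L) each step
    return y

def implicit_euler(n):
    h, y = T / n, eps
    for _ in range(n):
        y = y / (1 - h * L)        # multiply by (1 - h*L)^(-1) each step
    return y

for n in (16, 10, 5):
    print(n, explicit_euler(n), implicit_euler(n))
```

The exact value ε exp(LT) is about 1.2e−9. Implicit Euler gives a small positive answer for every n, including n = 5, while explicit Euler with n = 10 returns roughly 6.5: the factor (1 − hL)^{-1} has modulus below 1 for any h > 0 when L < 0, which is A-stability at work.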
Why do we want a new numerical method?
We want methods that are efficient for stiff problems
  Hence we cannot use highly implicit Runge-Kutta methods
  Hence we want to avoid non-A-stable BDF methods
We want methods that have high stage order
  Otherwise order reduction can occur for stiff problems
  Otherwise it is difficult to estimate accuracy
  Otherwise it is difficult to interpolate for dense output
We want methods for which asymptotically correct error estimates can be found
  This means having accurate information available in the step to make this possible
  High stage order makes this easier
We want methods for which changes to adjacent orders can be considered and evaluated
  This means being able to evaluate, for example, h^{p+2} y^{(p+2)} as well as h^{p+1} y^{(p+1)}
  (p denotes the order of the method)
  Evaluating the error for lower order alternative methods is usually easy
What are General Linear Methods?
General linear methods are the natural generalization of Runge–Kutta (multi-stage) methods and linear multistep (multivalue) methods.
This can be illustrated in a diagram.
General Linear Methods
Runge-Kutta Linear Multistep
Euler
We will consider r-value, s-stage methods, where r = 1 for a Runge–Kutta method and s = 1 for a linear multistep method.
Each step of the computation takes as input a certain
number (r) of items of data and generates for output the same number of items.
The output items correspond to the input items but advanced through one time step (h).
Within the step a certain number (s) of stages of computation are performed.
Denote by p the order of the method and by q the
“stage-order”.
At the start of step number n, denote the input items by y_i^{[n−1]}, i = 1, 2, . . . , r.
Denote the stages computed in the step and the stage derivatives by Y_i and F_i respectively, i = 1, 2, . . . , s.
For a compact notation, write

y^{[n−1]} = [y_1^{[n−1]}, y_2^{[n−1]}, . . . , y_r^{[n−1]}]^T,
y^{[n]} = [y_1^{[n]}, y_2^{[n]}, . . . , y_r^{[n]}]^T,
Y = [Y_1, Y_2, . . . , Y_s]^T,
F = [F_1, F_2, . . . , F_s]^T.
The stages are computed by the formulae

Y_i = Σ_{j=1}^{s} a_{ij} h F_j + Σ_{j=1}^{r} u_{ij} y_j^{[n−1]},   i = 1, 2, . . . , s,

and the output approximations by the formulae

y_i^{[n]} = Σ_{j=1}^{s} b_{ij} h F_j + Σ_{j=1}^{r} v_{ij} y_j^{[n−1]},   i = 1, 2, . . . , r.

These can be written together using a partitioned matrix:

[ Y       ]   [ A  U ] [ hF        ]
[ y^{[n]} ] = [ B  V ] [ y^{[n−1]} ]
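The partitioned form translates directly into code. The following sketch (hypothetical names, assuming an explicit method so that A is strictly lower triangular and the stages can be evaluated in order) advances the r input quantities by one step:

```python
import numpy as np

def glm_step(f, h, y_in, A, U, B, V):
    """One step of an r-value, s-stage general linear method.

    y_in has shape (r, N); returns the r output quantities, shape (r, N).
    """
    s = A.shape[0]
    N = y_in.shape[1]
    Y = np.zeros((s, N))           # stage values
    F = np.zeros((s, N))           # stage derivatives
    for i in range(s):
        Y[i] = h * (A[i, :i] @ F[:i]) + U[i] @ y_in
        F[i] = f(Y[i])
    return h * B @ F + V @ y_in

# Sanity check with the simplest GLM: Euler, r = s = 1, A = 0, U = B = V = 1.
A = np.array([[0.0]]); U = np.array([[1.0]])
B = np.array([[1.0]]); V = np.array([[1.0]])
y = np.array([[1.0]])              # y(0) = 1 for y' = -y
for _ in range(10):
    y = glm_step(lambda v: -v, 0.1, y, A, U, B, V)
print(y[0, 0])                     # 0.9**10, Euler's approximation to exp(-1)
```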
What we want from General Linear Methods
Convergence (consistency and stability)
Local accuracy
Global accuracy (understand propagation of errors)
A-stability for stiff problems
RK stability (behaves like Runge-Kutta method)
Propagation of errors
Let S denote a “starting method”, that is, a mapping from R^N to R^{rN}, and F : R^{rN} → R^N a corresponding finishing method such that F ∘ S = id.
The order of accuracy of a multivalue method M is defined in terms of a commutative diagram: with E denoting the exact flow through one step of size h, the method has order p when

M ∘ S = S ∘ E + O(h^{p+1}).

By duplicating this diagram over many steps (n applications of M compared with n applications of E, followed by the finishing method F), global error estimates of O(h^p) may be found.
Algebraic analysis of order
We recall the way order is defined for Runge-Kutta methods.
We consider the solution of an autonomous differential equation y′(x) = f(y(x)).
Let T denote the set of rooted trees. To each tree t ∈ T is associated an “elementary differential” F(t) defined in terms of the function f.
Also associated with each tree are its symmetry σ(t) and its density γ(t). In the following table, r(t) is the “order” (number of vertices) of t; the tree diagrams themselves are omitted.

r(t)  σ(t)  γ(t)  F(t)             F(t)^i
 1     1     1    f                f^i
 2     1     2    f′f              f^i_j f^j
 3     2     3    f′′(f, f)        f^i_{jk} f^j f^k
 3     1     6    f′f′f            f^i_j f^j_k f^k
 4     6     4    f′′′(f, f, f)    f^i_{jkl} f^j f^k f^l
 4     1     8    f′′(f, f′f)      f^i_{jk} f^j f^k_l f^l
 4     2    12    f′f′′(f, f)      f^i_j f^j_{kl} f^k f^l
 4     1    24    f′f′f′f          f^i_j f^j_k f^k_l f^l
We will look at a single example tree in more detail: the order-6 tree t = [τ, τ, [τ, τ]], a root carrying two leaves and the order-3 subtree [τ, τ].

r(t) = 6,  σ(t) = 4,  γ(t) = 18

The density is the product of the vertex labels 6 · 1 · 1 · 3 · 1 · 1, each vertex contributing the order of the subtree it roots; the symmetry σ(t) = 2! · 2! counts the two interchangeable pairs of leaves. The elementary differential is

F(t) = f′′′(f, f, f′′(f, f)).
The solution to the standard autonomous initial-value problem
y′(x) = f(y(x)), y(x0) = y0 has the formal Taylor series
y(x0 + h) = y0 + Σ_{t∈T} (h^{r(t)} / (σ(t) γ(t))) F(t)(y0).
A Runge-Kutta method has the same series but with
1/γ(t) replaced by a sequence of “elementary weights”
characteristic of the specific Runge-Kutta method.
Expressions of this type are usually referred to as B-series.
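The coefficients σ(t) and γ(t) in this series can be computed recursively. A sketch, assuming trees are encoded as nested tuples of their subtrees (the single-vertex tree τ being the empty tuple):

```python
from math import factorial
from collections import Counter

def r(t):
    """Order: the number of vertices of the tree."""
    return 1 + sum(r(c) for c in t)

def gamma(t):
    """Density: the order of t times the densities of its subtrees."""
    g = r(t)
    for c in t:
        g *= gamma(c)
    return g

def sigma(t):
    """Symmetry: product over distinct subtrees c (multiplicity k) of k! * sigma(c)**k."""
    s = 1
    for c, k in Counter(t).items():
        s *= factorial(k) * sigma(c) ** k
    return s

# The order-6 example tree from the slides: a root carrying two leaves
# and the order-3 subtree [tau, tau].
t = ((), (), ((), ()))
print(r(t), sigma(t), gamma(t))    # 6 4 18
```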
Let G denote the set of mappings T # → R where T # = {∅} ∪ T (and ∅ is the empty tree).
Also write G0 and G1 as subsets for which ∅ 7→ 0 or
∅ 7→ 1 respectively.
A mapping G1 × G → G can be defined which represents compositions of various operations.
In particular if this is restricted to G1 × G1 → G1 it becomes a group operation.
Let E denote the mapping t 7→ 1/γ(t) and D the
mapping which takes every tree to zero except the unique order 1 tree, which maps to one.
A general linear method defined from the matrices [A, U, B, V ] has order p if there exist ξ ∈ G^r and η ∈ G_1^s, such that
η = AηD + U ξ,
Eξ = BηD + V ξ, (*)
where pre-multiplication by E and post-multiplication by D are scalar times vector products and (*) is to hold only up to trees of order p.
High order and stage order
If we consider only methods with q = p, then simpler criteria can be found.
Let exp(cz) denote the component-by-component exponential. Then order p and stage-order q ≥ p − 1 is equivalent to

exp(cz) = zA exp(cz) + U w(z) + O(z^{q+1}),
exp(z) w(z) = zB exp(cz) + V w(z) + O(z^{p+1}),

where the power series w(z) = w_0 + w_1 z + · · · + w_p z^p represents an input approximation

y^{[0]} = Σ_{i=0}^{p} w_i h^i y^{(i)}(x_0).
GL Methods and RK stability
A General Linear Method is “Runge–Kutta stable” if its stability matrix

M(z) = V + zB(I − zA)^{-1} U

has only one non-zero eigenvalue.
Armed with this property we should expect attraction to the manifold S(RN), and stable adherence to this
manifold.
This means that the method starts acting like a one-step method.
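To see why RK stability is special, it helps to compute M(z) for a method that does not have it. As an illustrative sketch, the 2-step Adams–Bashforth method can be written as a GLM with inputs y_{n−1} and h f(y_{n−2}) (so r = 2, s = 1), and the eigenvalues of its stability matrix evaluated numerically:

```python
import numpy as np

# 2-step Adams-Bashforth as a GLM: the single stage is Y = y_{n-1}, and the
# outputs are y_n = y_{n-1} + (3/2) h F - (1/2) h f_{n-2} and h f_{n-1} = h F.
A = np.array([[0.0]])
U = np.array([[1.0, 0.0]])
B = np.array([[1.5], [1.0]])
V = np.array([[1.0, -0.5], [0.0, 0.0]])

def stability_matrix(z):
    s = A.shape[0]
    return V + z * B @ np.linalg.inv(np.eye(s) - z * A) @ U

eigs = np.linalg.eigvals(stability_matrix(-0.1))
print(sorted(abs(e) for e in eigs))
# Both eigenvalues are non-zero: the smaller one is the parasitic root of
# the multistep method, so this GLM is not Runge-Kutta stable.
```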
Doubly companion matrices
Matrices like the following are “companion matrices” for the polynomials z^n + α_1 z^{n−1} + · · · + α_n and z^n + β_1 z^{n−1} + · · · + β_n, respectively:

[ −α_1  −α_2  −α_3  · · ·  −α_{n−1}  −α_n ]      [ 0  0  0  · · ·  0  −β_n     ]
[   1     0     0   · · ·     0        0  ]      [ 1  0  0  · · ·  0  −β_{n−1} ]
[   0     1     0   · · ·     0        0  ]      [ 0  1  0  · · ·  0  −β_{n−2} ]
[   ⋮                                  ⋮  ]      [ ⋮                    ⋮      ]
[   0     0     0   · · ·     0        0  ]      [ 0  0  0  · · ·  0  −β_2     ]
[   0     0     0   · · ·     1        0  ]      [ 0  0  0  · · ·  1  −β_1     ]

Their characteristic polynomials can be found from det(I − zA) = α(z) or β(z), respectively, where

α(z) = 1 + α_1 z + · · · + α_n z^n,   β(z) = 1 + β_1 z + · · · + β_n z^n.

A matrix with both α and β terms,

X = [ −α_1  −α_2  −α_3  · · ·  −α_{n−1}  −α_n − β_n ]
    [   1     0     0   · · ·     0      −β_{n−1}   ]
    [   0     1     0   · · ·     0      −β_{n−2}   ]
    [   ⋮                                  ⋮        ]
    [   0     0     0   · · ·     0      −β_2       ]
    [   0     0     0   · · ·     1      −β_1       ]

is known as a “doubly companion matrix” and has characteristic polynomial defined by

det(I − zX) = α(z)β(z) + O(z^{n+1}).
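This characteristic-polynomial property is easy to check numerically. A sketch with n = 3 and arbitrarily chosen α and β values (the specific numbers are illustrative only):

```python
import numpy as np

n = 3
alpha = [0.5, -0.3, 0.2]      # alpha_1, alpha_2, alpha_3
beta  = [0.1,  0.4, -0.2]     # beta_1,  beta_2,  beta_3

# Build the doubly companion matrix: -alpha's across the first row,
# ones on the subdiagonal, -beta's up the last column.
X = np.zeros((n, n))
X[0, :] = [-a for a in alpha]
for i in range(1, n):
    X[i, i - 1] = 1.0
for i in range(n):
    X[i, -1] += -beta[n - 1 - i]

# np.poly(X) gives [1, c1, ..., cn] with det(wI - X) = w^n + c1 w^(n-1) + ...,
# so det(I - zX) = 1 + c1 z + ... + cn z^n.
det_coeffs = np.poly(X)

# Coefficients of alpha(z)*beta(z), truncated after the z^n term.
product = np.convolve([1.0] + alpha, [1.0] + beta)[: n + 1]

print(np.allclose(det_coeffs, product))    # True
```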
Matrices Ψ^{-1} and Ψ transforming X to Jordan canonical form are known.
In the special case of a single Jordan block with n-fold eigenvalue λ, we have

Ψ^{-1} = [ 1  λ + α_1  λ^2 + α_1 λ + α_2  · · · ]
         [ 0     1         2λ + α_1       · · · ]
         [ 0     0             1          · · · ]
         [ ⋮     ⋮             ⋮            ⋱   ]

where row number i + 1 is formed from row number i by differentiating with respect to λ and dividing by i.
We have a similar expression for Ψ:

Ψ = [   ⋱   ⋮      ⋮               ⋮         ]
    [ · · ·  1  2λ + β_1  λ^2 + β_1 λ + β_2 ]
    [ · · ·  0      1          λ + β_1      ]
    [ · · ·  0      0              1        ]

The Jordan form is Ψ^{-1} X Ψ = λI + J, where J_{ij} = δ_{i,j+1}; that is,

Ψ^{-1} X Ψ = [ λ  0  · · ·  0  0 ]
             [ 1  λ  · · ·  0  0 ]
             [ ⋮            ⋮  ⋮ ]
             [ 0  0  · · ·  λ  0 ]
             [ 0  0  · · ·  1  λ ]
Inherent Runge–Kutta stability
Using doubly companion matrices, it is possible to construct GL methods possessing RK stability using
rational operations. The methods constructed in this way are said to possess “Inherent Runge–Kutta Stability”.
Apart from exceptional cases (in which certain matrices are singular), we characterize the method with r = s = p + 1 = q + 1 by the parameters:
  λ — single eigenvalue of A
  c_1, c_2, . . . , c_s — stage abscissae
  Error constant
  β_1, β_2, . . . , β_p — elements in the last column of the s × s doubly companion matrix X
  Information on the structure of V
Consider only methods for which the step-n outputs approximate the “Nordsieck vector”:

[ y_1^{[n]}     ]     [ y(x_n)           ]
[ y_2^{[n]}     ]     [ h y′(x_n)        ]
[ y_3^{[n]}     ]  ≈  [ h^2 y′′(x_n)     ]
[   ⋮           ]     [   ⋮              ]
[ y_{p+1}^{[n]} ]     [ h^p y^{(p)}(x_n) ]

For such methods, V has the form

V = [ 1  v^T ]
    [ 0  V̇   ]
Such a method has the IRKS property if a doubly companion matrix X exists so that, for some vector ξ,

BA = XB,   BU = XV − V X + e_1 ξ^T,   ρ(V̇) = 0.

It will be shown that, for such methods, the stability matrix satisfies

M(z) ∼ V + z e_1 ξ^T (I − zX)^{-1},

which has all except one of its eigenvalues zero. The non-zero eigenvalue has the role of stability function

R(z) = N(z) / (1 − λz)^s.
Proof of stability result
(I − zX) M(z) (I − zX)^{-1}
  = (V − zXV + z(B − zXB)(I − zA)^{-1} U)(I − zX)^{-1}
  = (V − zXV + zBU)(I − zX)^{-1}           (using XB = BA, so (B − zXB)(I − zA)^{-1} = B)
  = V + z e_1 ξ^T (I − zX)^{-1}            (using BU = XV − V X + e_1 ξ^T)

This has a single non-zero eigenvalue equal to

R(z) = 1 + z ξ^T (I − zX)^{-1} e_1 = det(I + z(e_1 ξ^T − X)) / det(I − zX).
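The last equality is an instance of the matrix determinant lemma and can be checked numerically for an arbitrary X and ξ (random values here, purely as an illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
X  = rng.standard_normal((n, n))
xi = rng.standard_normal(n)
e1 = np.zeros(n); e1[0] = 1.0
z = 0.3

I = np.eye(n)
# R(z) computed two ways: via a linear solve, and via the determinant ratio.
lhs = 1 + z * xi @ np.linalg.solve(I - z * X, e1)
rhs = np.linalg.det(I + z * (np.outer(e1, xi) - X)) / np.linalg.det(I - z * X)
print(np.isclose(lhs, rhs))    # True
```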
Example methods
The following third order method is explicit and suitable for the solution of non-stiff problems
[Coefficient tableau [A U; B V] for the third order explicit method: not legible in this copy]
The following fourth order method is implicit, L-stable, and suitable for the solution of stiff problems
[Coefficient tableau [A U; B V] for the fourth order implicit method: not legible in this copy]
Implementation questions
Many implementation questions are similar to those for traditional methods but there are some new challenges.
We want variable order and stepsize and it is even a realistic aim to change between stiff and non-stiff methods automatically.
Because of the variable order and stepsize aims, we wish to be able to do the following:
  Estimate the local truncation error of the current step
  Estimate the local truncation error of an alternative method of higher order
  Change the stepsize with little cost and with little impact on stability
My final comment is that all these things are possible and many of the details are already known and understood.
What is not known is how to choose between different choices of the free parameters.
Identifying the best methods is the first step in the
construction of a competitive stiff and non-stiff solver.
References
The following recent publications contain references to earlier work:
Numerical Methods for Ordinary Differential Equations, J. C. Butcher, J. Wiley, Chichester (2003).
General linear methods, J. C. Butcher, Acta Numerica 15 (2006), 157–256.
Thank you