Note: This notebook contains an interactive element which uses Interact.jl. That package does not work with Jupyter Notebook version 7, nor with JupyterLab. I imagine they will fix this issue in the near future. Right now, to view the interactive elements, click "open in NBClassic" and uncomment the relevant lines. You will also need to install an extension.
Period of a Pendulum
We begin our exploration of computational physics with an example from mechanics. We will consider the motion of a pendulum.
Imagine a pendulum of length \(L\) is pulled up to make an angle \(\theta\) with respect to the vertical.
The component of the gravitational force which is in the direction of motion is \(F_\parallel=-m g \sin(\theta)\). The speed of the bob is \(v=L\,d\theta/dt\). Thus we get the equation of motion

$$mL\frac{d^2\theta}{dt^2}=-mg\sin(\theta).$$
We know the pendulum will swing back and forth. We want to calculate its period.
The classic approach is to use energy arguments. The kinetic energy of the pendulum is

$$KE=\frac{mv^2}{2}=\frac{mL^2}{2}\left(\frac{d\theta}{dt}\right)^2.$$

The potential energy is

$$PE=-mgL\cos(\theta).$$

Then the total energy is

$$E=KE+PE=\frac{mL^2}{2}\left(\frac{d\theta}{dt}\right)^2-mgL\cos(\theta).$$

Since the total energy is conserved we can solve this for the velocity:

$$v=L\frac{d\theta}{dt}=\pm\sqrt{\frac{2E}{m}+2gL\cos(\theta)}.$$

Our argument is slightly simpler if we note that the energy can be related to the maximum angle that the pendulum reaches, \(\theta_m\), where the velocity vanishes, so:

$$E=-mgL\cos(\theta_m).$$

Next we note that \(v=L\,d\theta/dt\) and hence

$$\frac{d\theta}{dt}=\sqrt{\frac{2g}{L}\left(\cos(\theta)-\cos(\theta_m)\right)}.$$

We then note that a quarter period is the time it takes for the pendulum to rise from \(\theta=0\) to \(\theta=\theta_m\), and hence

$$\frac{T}{4}=\int_0^{\theta_m}\frac{d\theta}{\sqrt{\frac{2g}{L}\left(\cos(\theta)-\cos(\theta_m)\right)}}.$$
This integral is an elliptic integral. It has no closed form in terms of more elementary functions. Thus we need to develop numerical methods to evaluate it.
The task of numerically evaluating integrals is known as quadrature.
Adimensionalizing
A good first step in any numerical calculation is to adimensionalize your equations. This is actually a good step in analytic work as well.
We will learn how to systematically do this, but here I know that for small amplitudes the period of a pendulum is

$$T_0=2\pi\sqrt{\frac{L}{g}}.$$

Thus it makes sense for us to calculate

$$\frac{T}{T_0}=\frac{\sqrt{2}}{\pi}\int_0^{\theta_m}\frac{d\theta}{\sqrt{\cos(\theta)-\cos(\theta_m)}}.$$
The angle \(\theta_m\) is already dimensionless, so we can leave it as-is.
Strategies
We have several strategies we could pursue now:

1. Type "Numerical Integration in Julia" into Google, then download the right package and do the integral.

2. Write our own integrator.
Both are useful. The advantage of the package is that it will probably get us results sooner. The advantage of writing our own is that we can tweak it to meet our needs. Also, we will learn a lot from writing the integrator.
We are going to try to walk the line here, both giving you the skills to work with low-level algorithms and the ability to take advantage of high-level packages.
It is also worth noting that there is work involved with learning how to use a package. Generally it will be less work than writing your own routines – but that outlay should not be discounted.
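As a taste of the first strategy, here is a minimal sketch using the QuadGK.jl package (my choice of package is an assumption – any quadrature package would do, and we will not use it elsewhere in this notebook). It evaluates the dimensionless period \(T/T_0\) derived above:

using QuadGK   # assumed package: an adaptive Gauss-Kronrod quadrature library

# T/T0 = (√2/π) ∫₀^θm dθ / √(cos θ - cos θm)
function period_ratio(θm)
    I,err=quadgk(θ->1/sqrt(cos(θ)-cos(θm)),0,θm)
    return sqrt(2)/π*I
end

period_ratio(0.1)    # ≈ 1.0006 -- small swings reproduce T ≈ T0
period_ratio(π/2)    # ≈ 1.18   -- large swings take noticeably longer

The rest of this notebook pursues the second strategy.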
Simple approaches to numerical quadratures
Reading: Chapter 4 of Numerical Recipes
The basic idea with most quadrature approaches is to break the integral into discrete segments, and then approximate the integral on each of those segments. One straightforward approach is the “trapezoid rule”, where one approximates the area of each segment by a trapezoid:
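Concretely, with grid points \(x_j=a+j\,\Delta x\) and \(\Delta x=(b-a)/n\), the rule reads

$$\int_a^b f(x)\,dx\approx\sum_{j=0}^{n-1}\frac{f(x_j)+f(x_{j+1})}{2}\,\Delta x.$$

The code below visualizes this construction.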
using Plots,LaTeXStrings #load in visualization software
# using Interact #uncomment for interactive content
#only works in classic notebooks
"""
showtrapezoids(f,range=(0,1),n=5;opts...)
Illustrates trapezoid rule by plotting the function `f` over the range specified by `range`.
Shows `n` trapezoids overlaid, and passes options `opts` on to the plot command.
"""
function showtrapezoids(f,range=(0,1),n=5;opts...)
# define variables
xmin=range[1]
xmax=range[2]
# Create basic plot of function
plt=plot(f,xmin,xmax,xlabel="x",ylabel="f",legend=false,show=false,opts...)
#
# Draw trapezoids
#
function trapezoid(j)
width=(xmax-xmin)/n
x1=xmin+(j-1)*width # offset by xmin so the trapezoids follow the plotted range
x2=xmin+j*width
y1=f(x1)
y2=f(x2)
Shape([x1, x1, x2, x2],[0,y1,y2,0])
end
for j in 1:n
plot!(trapezoid(j),opacity=0.2)
end
plt
end
showtrapezoids
f(x)=1-2x^2+4x^4
showtrapezoids(f,(0,pi/4),4)
#Uncomment for interactive content
# -- only works in classic notebooks
#
#@manipulate for n in 1:10
# p=showtrapezoids(f,(0,pi/2),n)
#end
Naive implementation of trapezoid rule
Let's not worry about efficiency, and write the simplest implementation of the trapezoid rule that we can think of. We will make this more efficient later – but this version is already fast enough for our purposes.
"""
naive_trap(f,range,n)
is a naive implementation of the trapezoid rule for calculating
`∫f(x)dx`, with `x` running between `a` and `b` where `range=(a,b)`.
The integral region is divided into `n` trapezoids.
"""
function naive_trap(f,range,n)
result=0.
xmin=range[1]
xmax=range[2]
dx=(xmax-xmin)/n
for j in 1:n
x1=xmin+(j-1)*dx
x2=x1+dx
result+= (f(x2)+f(x1))*dx/2
end
return result
end
naive_trap
At the beginning of the function I supplied a “docstring”. This allows one to use the interactive help feature to learn about the function
?naive_trap
search: naive_trap

naive_trap(f,range,n)

is a naive implementation of the trapezoid rule for calculating ∫f(x)dx, with x running between a and b where range=(a,b). The integral region is divided into n trapezoids.
testf(x)=x
naive_trap(testf,(1,0),100)
-0.4999999999999999
2^4/4
4.0
Notationally that is fine – but for things like this it is nice to be able to pass around “pure functions”, “lambda functions”, or “anonymous functions”.
naive_trap(x->x^3,(0,1),10)
0.25250000000000006
In that expression, `x->x^3` is short-hand for creating an unnamed function – just like `testf`, but without the name – and inserting it directly into the function call. Here are a couple more examples of using this notation.
(x->x^2)(4)
16
g=(x->x+2)
g(5)
7
Now, one thing I hate is remembering exactly the notation for a function definition. A nice trick in Julia is to give multiple versions of the function that will work for any natural notation.
# alternate signature
naive_trap(f,a,b,n)=naive_trap(f,(a,b),n)
#keyword arguments
naive_trap(;f,range,n)=naive_trap(f,range,n)
naive_trap(f;range,n=10)=naive_trap(f,range,n)
naive_trap (generic function with 4 methods)
naive_trap(;g,space,m)=naive_trap(g,space,m)
naive_trap (generic function with 4 methods)
naive_trap(g=(x->x^2),space=(0,1),m=45)
0.3334156378600823
naive_trap(x->x^2,range=(2,5))
39.045
Here are all the different ways we can now call our function. Note that keyword names do not take part in dispatch, so our definition with keywords `g`, `space`, and `m` replaced the earlier keyword-only method – the calls below that use the keywords `f`, `range`, and `n` therefore throw an `UndefKeywordError`.
# the original way
#-- matches `naive_trap(f,range,n)`
naive_trap(x->x^2,(0,2),4)
2.75
# matches `naive_trap(f,a,b,n)`
naive_trap(x->x^2,0,2,4)
2.75
# uses keyword arguments -- this now fails (see note above)
naive_trap(f=(x->x^2),range=(0,2),n=4)
UndefKeywordError: keyword argument `g` not assigned
Stacktrace:
[1] top-level scope
@ In[19]:2
naive_trap(n=4,f=(x->x^2),range=(0,2))
UndefKeywordError: keyword argument `g` not assigned
Stacktrace:
[1] top-level scope
@ In[21]:1
dictargs=Dict(:f=>(x->x^2),:range=>(0,2),:n=>4)
naive_trap(;dictargs...)
UndefKeywordError: keyword argument `g` not assigned
Stacktrace:
[1] top-level scope
@ In[22]:2
ntupleargs=(f=(x->x^2),range=(0,2),n=4)
naive_trap(;ntupleargs...)
UndefKeywordError: keyword argument `g` not assigned
Stacktrace:
[1] top-level scope
@ In[23]:2
args=(x->x^2,(0,2),4)
naive_trap(args...)
2.75
Choosing n

How big an n should we use? Let's see how our result depends on n.
nvals=1:50
tvals=[naive_trap(x->x^3,(0,1),n) for n in nvals]
plot(nvals,tvals,xlabel="n",ylabel=L"$\int_0^1x^3dx$",legend=false)
scatter!(nvals,tvals)
We see that this converges pretty quickly. One trick to understand this is to rescale one of the axes. Let's try plotting the difference between the exact result and this approximation, as a function of \(1/n\).
plot(1 ./nvals,tvals.-0.25,xlabel="1/n",ylabel=L"$\int_0^1x^3dx$-0.25",legend=false)
scatter!(1 ./nvals,tvals.-0.25)
The notation `tvals.-0.25` or `1 ./nvals` means to apply the operation elementwise.
x=(1,2,3)
x .-0.25   # broadcasts elementwise: (0.75, 1.75, 2.75)
x-0.25     # MethodError -- you cannot subtract a scalar from a tuple without broadcasting
Aside over – let's now look at that plot.
plot(1 ./nvals,tvals.-0.25,xlabel="1/n",ylabel=L"$\int_0^1x^3dx$-0.25",legend=false)
scatter!(1 ./nvals,tvals.-0.25)
It kind of looks like a parabola. Let's plot against \(1/n^2\).
plot(1 ./nvals.^2,tvals.-0.25,xlabel="1/n²",ylabel=L"$\int_0^1x^3dx$-0.25",legend=false)
scatter!(1 ./nvals.^2,tvals.-0.25)
Apparently the error scales as \(1/n^2\). An even better approach is to use a log-log plot:
scatter(nvals,abs.(tvals.-0.25),xlabel="n",ylabel=L"$\int_0^1x^3dx$-0.25",
legend=false,xaxis=:log, yaxis=:log)
If a plot of \(\log(y)\) vs \(\log(x)\) is a straight line: \(\log(y)=a+b \log(x)\), then \(y\) and \(x\) are related by a power law: \(y=\alpha x^\beta\), where \(\alpha=\exp(a)\) and \(\beta=b\). Let's try to fit our log-log plot to a straight line.
Later in the semester we will develop curve fitting routines, but here is a quick one.
My curve fitting program is literally 2 lines long, and illustrates why sometimes one does not want to bother with a package. I am sure it would take longer to Google the package than it does to write the function.
"""
linearfit(X,Y,functions)
takes a set of `X=(x1,x2,...xn)`, `Y=(y1,y1,...yn)` and a set of `functions=(f1, f2,...fm)`
and returns the set of coefficients `C=(c1,c2,..cm)` which minimize
n m
χ²= ∑ (yi- ∑ cj fj(xi))^2
i=1 j=1
It is implemented by constructing the `n×m` matrix `A` with matrix elements
Aᵢⱼ= fj(xi)
then `(AᵗA)C=AᵗY`, where `Y=(y1,y2,...yn)`. Thus the optimal coefficients are
`(AᵗA)⁻¹(AᵗY)`, which can be implemented with `(AᵗA)\\(AᵗY)`.
"""
function linearfit(X,Y,functions)
A=[f(x) for x in X, f in functions ]
return transpose(A)*A\(transpose(A)*Y) # This is equivalent to (AᵗA)⁻¹(AᵗY)
end # the "\" is "divide from left"
linearfit
?linearfit
search: linearfit

linearfit(X,Y,functions)

takes a set of X=(x1,x2,...xn), Y=(y1,y1,...yn) and a set of functions=(f1, f2,...fm) and returns the set of coefficients C=(c1,c2,..cm) which minimize

     n       m
χ² = ∑ (yi - ∑ cj fj(xi))²
    i=1     j=1

It is implemented by constructing the n×m matrix A with matrix elements

Aᵢⱼ = fj(xi)

then (AᵗA)C=AᵗY, where Y=(y1,y2,...yn). Thus the optimal coefficients are (AᵗA)⁻¹(AᵗY), which can be implemented with (AᵗA)\(AᵗY).
The divide from left notation may seem a little strange on a computer, but it makes sense.
1 / 2 # The normal way to take the ratio of 1 and 2
0.5
2 \ 1 # an alternative way to take the ratio of 1 and 2
0.5
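The real payoff is for matrices, where `\` solves a linear system – this is exactly what our `linearfit` uses in `(AᵗA)\(AᵗY)`. A small sketch:

# For matrices, A \ b returns the solution x of A*x = b
A=[1.0 2.0; 3.0 4.0]
b=[5.0, 6.0]
A \ b            # the 2-element solution vector
A*(A \ b) ≈ b    # true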
lf1=linearfit(log.(nvals),log.(tvals.-0.25),(x->1.,x->x))  # returns ≈ [-1.386, -2.0]
That output means that \(\log(t-0.25)=-1.386 - 2\log(n)\), or in other words, \(t=0.25+\exp(-1.386)\,n^{-2}\).
scatter(nvals,tvals.-0.25,xlabel="n",ylabel=L"$\int_0^1x^3dx$-0.25",
legend=false,xaxis=:log, yaxis=:log)
plot!(x->exp(lf1[1]+lf1[2]*log(x)),1,100)
Thus we see that the error scales as \(1/n^2\). Can we understand this?
Theory tools
Big O Notation
We will often say that a function \(f_n\) scales as \(O(n^\alpha)\) as \(n\to\infty\). This means that there exists an integer \(N\) and a constant \(C\) such that \(|f_n|\leq Cn^\alpha\) for all \(n>N\). That is, we can bound \(f_n\) by some constant times \(n^\alpha\). Note that this is an upper bound, and does not mean the bound is tight.
Here is an example showing that \(y=1/(x-1)\) is bounded by \(2/x\) for \(x\geq 2\). Thus we say \(1/(x-1)=O(x^{-1})\).
plot(x->2/x,2,10,fillrange = [0,0], fc=:blues,label="2/x",xlabel="x",ylabel="y")
plot!(x->1/(x-1),1.5,10,label="1/(x-1)",linewidth=3)
Sometimes we also want to talk about tight bounds. In this course we will write \(f_n\sim n^\alpha\) as \(n\to\infty\) to mean that there exists a constant \(C\) such that \(|f_n- C n^\alpha|\) vanishes as \(n\to\infty\).
In the case where \(f_n\) itself vanishes as \(n\to\infty\), we want this to be a little bit stronger, and will require that \(|f_n-C n^\alpha|/|f_n|\) vanishes as \(n\to\infty\). That is, we want the difference between the curves to vanish faster than \(f_n\) itself.
Here is an example. We would write \(1/(x-1)\sim 1/x\) as \(x\to\infty\). Note, we could also write \(1/(x-1)=1/x+O(x^{-2})\), which conveys more information.
plot(x->1/x,2,10,fillrange = x->1/(x-1), fc=:blues,label="1/x",xlabel="x",ylabel="y")
plot!(x->1/(x-1),1.5,10,label="1/(x-1)",linewidth=3)
We will often leave off the “\(n\to\infty\)” part of the statement. Sometimes we will take a different limit – for example we could say \(f(x)=O(x^{-2})\) as \(x\to0\), to mean that for small \(x\) \(|f(x)|\) is bounded from above by a function proportional to \(x^{-2}\).
We can also talk about \(O(g(x))\) for functions other than power laws.
Richardson Extrapolation
If we know that the error in the trapezoid rule scales as \(n^{-2}\) we can do the integral for several values of \(n\) and then fit to a function of the form \(I_n=I_\infty + \epsilon n^{-2}\) – extracting a better approximation to the integral.
I4=naive_trap(x->x^3,(0,1),4)   # 0.265625   = 1/4 + 1/64
I8=naive_trap(x->x^3,(0,1),8)   # 0.25390625 = 1/4 + 1/256
We then multiply by \(n^2\),

$$n^2 I_n=n^2 I_\infty+\epsilon.$$

We then take two values of \(n\) – say \(n=4\) and \(n=8\):

$$4^2 I_4=4^2 I_\infty+\epsilon,\qquad 8^2 I_8=8^2 I_\infty+\epsilon.$$

Subtract these,

$$8^2 I_8-4^2 I_4=\left(8^2-4^2\right)I_\infty,$$

or

$$I_\infty=\frac{2^2 I_8-I_4}{2^2-1}.$$
(2^2*I8-I4)/(2^2-1)
0.25
This was exact in this case because of the form of the integrand. In general, if we used \(I_n\) and \(I_{n/2}\) we would still have an error – but instead of scaling as \(n^{-2}\), it would scale as \(n^{-4}\). The approximation \(I\approx (4/3)I_n-(1/3)I_{n/2}\) is known as Simpson's rule.
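As a quick check of that scaling, here is a minimal sketch built from our `naive_trap`. For \(\int_0^1 x^4\,dx=1/5\), where Simpson's rule is not exact, the error should drop by a factor of \(2^4=16\) each time we double \(n\):

# Simpson's rule as a Richardson-extrapolated trapezoid rule
simpson(f,range,n)=(4*naive_trap(f,range,n)-naive_trap(f,range,n÷2))/3

err(n)=abs(simpson(x->x^4,(0,1),n)-1/5)
err(8)/err(16)   # ≈ 16, the n^{-4} signature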
This trick can be repeated. I can take Simpson’s rule with two different \(n\)’s, and combine them to get an even better approximation to the integral. This strategy is known as Romberg Integration.
Caution: Richardson extrapolation (and Romberg integration schemes) rely on our integrand being smooth. If it is poorly behaved, then they don’t necessarily help.
Euler-Maclaurin Summation formula
A good framework for understanding these approximations is the Euler-Maclaurin summation formula. Here is a somewhat formal derivation. This is mostly for my own amusement – and this derivation contains one or two cheats.
We will first remind you about series expansions. Analytic functions can be written in series, such as

$$e^x=\sum_{j=0}^{\infty}\frac{x^j}{j!}.$$
Such a series is sometimes used to define a set of numbers. We are going to need the Bernoulli numbers, which are defined by the series

$$\frac{x}{e^{x}-1}=\sum_j \frac{B_j}{j!} x^j.$$

One can brute-force expand the left hand side to get \(B_0=1\), \(B_1=-1/2\), \(B_2=1/6\), \(B_3=0\), \(B_4=-1/30,\cdots\). You can easily Google them.
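If you would rather compute them than Google them: multiplying the defining series by \(e^x-1\) and matching powers of \(x\) gives the recurrence \(\sum_{j=0}^{m}\binom{m+1}{j}B_j=0\) for \(m\geq1\). A minimal sketch:

# Bernoulli numbers from the recurrence above, using exact rational arithmetic
function bernoulli(nmax)
    B=zeros(Rational{Int},nmax+1)
    B[1]=1                          # B_0 (Julia arrays are 1-indexed)
    for m in 1:nmax
        B[m+1]=-sum(binomial(m+1,j)*B[j+1] for j in 0:m-1)//(m+1)
    end
    return B
end

bernoulli(4)   # 1//1, -1//2, 1//6, 0//1, -1//30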
With Bernoulli numbers in hand we will now think about operators that act on the space of functions. They take in a function, and spit out a new function. For example, let's introduce \(\Delta\), which is defined by \((\Delta f)(x)=f(x+1)-f(x)\). In Julia we can create this operator with
Δ(f)=(x->f(x+1)-f(x))
or if that notation makes you queasy, you could also write
function Δ(f)
function g(x)
return f(x+1)-f(x)
end
return g
end
f1(x)=x^2
f2=Δ(f1)
@show f1(5)-f1(4)
@show f2(4);
f1(5) - f1(4) = 9
f2(4) = 9
We will next introduce \(\Sigma\). It again maps functions onto functions. We will only need to define what it does to functions on integers. It is defined for positive integers \(x\) as

$$(\Sigma f)(x)=\sum_{n=0}^{x-1} f(n).$$
I won’t write a function that does this, but you can imagine how to do it. I claim that \(\Delta\Sigma=I\), where \(I\) is the identity operator. To prove this one writes

$$(\Delta\Sigma f)(x)=\sum_{n=0}^{x}f(n)-\sum_{n=0}^{x-1}f(n).$$
All of the terms cancel except the last, which gives, \((\Delta\Sigma f)(x)=f(x)\), which shows that \(\Delta\Sigma\) is the identity operator. This makes sense, \(\Delta\) and \(\Sigma\) are inverses of each-other.
In the other order, however, we find \((\Sigma\Delta f)(x)=f(x)-f(0)\), so the order matters.
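If you do want to see \(\Sigma\) written out, here is a minimal sketch, valid when \(x\) is a nonnegative integer:

Σ(f)=(x->sum(f,0:x-1;init=0))

h=Σ(Δ(f1))          # (ΣΔ f1)(x) should equal f1(x) - f1(0)
h(5), f1(5)-f1(0)   # both give 25, with f1(x)=x^2 from above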
Next we introduce two more operators,

$$(\partial f)(x)=\frac{df}{dx},\qquad (\smallint f)(x)=\int_0^x f(y)\,dy.$$
The fundamental theorem of calculus states that \(\partial\smallint= I.\) As with the \(\Sigma\) and \(\Delta\), if we reverse the order we get \((\smallint\partial f)(x)=f(x)-f(0)\).
We now turn to Taylor's theorem,

$$f(x+1)=\sum_{j=0}^{\infty}\frac{1}{j!}\frac{d^j f}{dx^j}=\left(e^{\partial}f\right)(x).$$

Thus Taylor's theorem can be written as \(\Delta=e^{\partial}-1\).
Putting this together with our previous result, \(\Delta\Sigma=I\), yields

$$\Sigma=\frac{1}{e^{\partial}-1}.$$
We will now multiply the right hand side by \(I=\partial\smallint\) to find

$$\Sigma=\frac{\partial}{e^{\partial}-1}\,\smallint.$$
Finally we expand the right hand side in a Taylor series:

$$\Sigma=\sum_j\frac{B_j}{j!}\,\partial^{\,j}\smallint=\smallint-\frac{1}{2}\,\partial\smallint+\frac{1}{12}\,\partial^{\,2}\smallint-\cdots,$$
which can be reorganized into the Euler-Maclaurin summation formula

$$\int_0^x f(y)\,dy=\sum_{n=0}^{x-1}f(n)+\frac{f(x)-f(0)}{2}-\sum_{j\geq2}\frac{B_j}{j!}\left[f^{(j-1)}(x)-f^{(j-1)}(0)\right].$$
The first two terms give you the trapezoid rule. Subsequent terms give you corrections in terms of derivatives of the function at the bounds.
It may seem strange that knowledge about the behaviour of the function at the boundaries so strongly constrains the integral. This is a peculiarity of analytic functions.
Finally we rescale the axis, taking the bounds of the integral to go from \(a\) to \(b\). Thus we define \(s=a+(b-a)y/n\) and \(g(s)=f(y)\). Straightforward arithmetic then gives, with \(h=(b-a)/n\),

$$\int_a^b g(s)\,ds=h\left[\frac{g(a)}{2}+g(a+h)+\cdots+g(b-h)+\frac{g(b)}{2}\right]-\sum_{j\geq2}\frac{B_j}{j!}\,h^{j}\left[g^{(j-1)}(b)-g^{(j-1)}(a)\right].$$
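We can check the leading correction against our earlier fit. For \(\int_0^1 x^3\,dx\) the \(j=2\) term predicts a trapezoid error of \((h^2/12)\left[g'(1)-g'(0)\right]=1/(4n^2)\) – which also explains the intercept we fit earlier, since \(\exp(-1.386)\approx 1/4\). A quick numerical check:

n=10
naive_trap(x->x^3,(0,1),n)-1/4   # ≈ 0.0025
1/(4n^2)                         # = 0.0025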
Improving our trapezoid rule integrator
As a reminder, here is our naive implementation of the trapezoid rule
function naive_trap(f,range,n)
result=0.
xmin=range[1]
xmax=range[2]
dx=(xmax-xmin)/n
for j in 1:n
x1=xmin+(j-1)*dx
x2=x1+dx
result+= (f(x2)+f(x1))*dx/2
end
return result
end
naive_trap (generic function with 4 methods)
Algorithmic inefficiencies:

- The function `f` is evaluated twice at every location.
- We have `2n` multiplications by `dx`.
- We don't use Richardson extrapolation.

Inconveniences:

- We need to specify `n`, the number of trapezoids.
- We don't know how accurate our result is.
Note: as discussed at the end of this notebook, there are other, more subtle inefficiencies in the actual compiled implementation.
Fixing the most obvious inefficiency
The factor of 2 is easy to fix. Instead of looping over the trapezoids we loop over the points.
"""
trap(f,range,n)
is an implementation of the trapezoid rule for calculating
`∫f(x)dx`, with `x` running between `a` and `b` where `range=(a,b)`.
The integral region is divided into `n` trapezoids.
"""
function trap(f,range,n)
xmin=range[1]
xmax=range[2]
dx=(xmax-xmin)/n
result=(f(xmin)+f(xmax))/2
for j in 1:n-1
result+= f(xmin+j*dx) # offset by xmin, so the rule works for ranges not starting at 0
end
return result*dx
end
trap
To see a speed difference, however, one needs to go to a more complicated integrand and a very large number of trapezoids.
@time naive_trap(x->sin(log(cos(x)^2+1)),(0.,pi),2^22)
0.139106 seconds (5.57 k allocations: 380.375 KiB, 4.32% compilation time)
1.1222120074972937
@time trap(x->sin(log(cos(x)^2+1.)),(0.,pi),2^22)
0.071102 seconds (5.58 k allocations: 370.992 KiB, 6.57% compilation time)
1.1222120074973119
@time trap(x->sin(log(cos(x)^2+1.)),(0.,pi),2^22)
0.069529 seconds (5.46 k allocations: 363.070 KiB, 7.31% compilation time)
1.1222120074971986
For a smaller number of trapezoids, or simpler integrands, the time is dominated by compilation, and this factor of 2 is irrelevant. Note that for this example we used about 4 million points – which was certainly overkill:
2^22
4194304
@time naive_trap(x->x^3,(0.,1.),1000)
@time trap(x->x^3,(0.,1.),1000)
Dynamic step size
Let's implement a strategy from Numerical Recipes which does the trapezoid rule with dynamic step sizes.
One reason I want to go through it, is that it introduces some useful computer science – namely using objects. Objects are the nouns of programs – just as functions are the verbs.
This approach subdivides the integral into \(n=2^m\) trapezoids. We are going to sequentially increase \(m\) until we achieve our desired accuracy. In order to do this we define an object `trap_data` which stores information about the integral that you are computing, and an interim result.
"""
trap_data(f,range)
stores the intermediate information for our trapezoid rule.
Fields:
f # store the function we are integrating
range # store the range (a,b)
val # trapezoid rule integral at current depth, m
m # depth -- number of trapezoids = 2ᵐ
To increase the depth use command `subdivide!` or `subdivide`.
"""
mutable struct trap_data
f # store the function we are integrating
range # store the range (a,b)
val # trapezoid rule integral at current depth, m
m # depth -- number of trapezoids = 2ᵐ
function trap_data(f,range)
new(f,range,(f(range[1])+f(range[2]))*(range[2]-range[1])/2,0)
end
end
"""
subdivide!(data ::trap_data)
takes a `trapdata` object, and doubles the number of gridpoints in the integral.
"""
function subdivide!(data ::trap_data)
# Extract needed info from `data`
f=data.f
m=data.m+1
result=zero(data.val)
(a,b)=data.range
# Width of new rectangles
dx=(b-a)/2^m
for x in dx:dx*2:dx*(2^m-1)
result+=f(a+x)*dx
end
result=result+data.val/2
# Update data
data.m=m
data.val=result
end
subdivide!
# the start:step:stop range syntax used in `subdivide!` above
for x in 1:0.5:25
println(x)
end
1.0
1.5
2.0
⋮
24.0
24.5
25.0
We will learn more about this syntax as the course progresses. Nevertheless, to see how this works, we create one of our data objects
td1=trap_data(x->x^3,(1,3))
trap_data(var"#64#65"(), (1, 3), 28.0, 0)
We can extract the stored information with a dot
td1.range
(1, 3)
td1.f(4)
64
g=td1.f
#64 (generic function with 1 method)
g(7)
343
r=td1.range
(1, 3)
td1.val
28.0
td1.m
0
naive_trap(td1.f,td1.range,2^td1.m)
28.0
We can then use the `subdivide!` function to break the integral into more pieces
subdivide!(td1)
22.0
td1.m
1
(td1.m,td1.val,naive_trap(x->x^3,(1,3),2^td1.m))
(1, 22.0, 22.0)
subdivide!(td1)
20.5
(td1.m,td1.val,naive_trap(x->x^3,(1,3),2^td1.m))
(2, 20.5, 20.5)
Every time we call `subdivide!` we double the number of trapezoids. However, we only evaluate `f` at the \(2^{m-1}\) new midpoints.
We now just need to wrap this in a function which calls `subdivide!` until we hit some convergence criterion. We will require that the fractional change in the integral falls below some threshold, which we will call `eps`.
"""
`nrtrap(f,range;namedargs...)`
Uses the trapezoid rule algorithm from Numerical Recipes to calculate ``\\int_a^b f(x) dx``
where `range=(a,b)`.
`namedargs`
- `eps=1E-10`: algorithm terminates when relative change in integral falls below `eps`
- `max_divisions=20`: make no more than `max_divisions` subdivisions
- `debug=false`: set to `true` to see debug information
"""
function nrtrap(f,range;eps=1E-10,max_divisions=20,debug=false)
data=trap_data(f,range)
oldval=0.
val=0.
for j in 1:max_divisions
oldval=data.val
subdivide!(data)
val=data.val
if abs(val-oldval)<eps*abs(oldval) || (val==0 && oldval==0)
if debug
print("converged after "*string(j)*" iterations")
end
return val
end
end
error("Did not converge to precision "*string(eps)*
"after "*string(max_divisions)*" subdivisions."*
"previous 2 results: ("*string(oldval)*
","*string(val)*")")
end
nrtrap
"a"*"b"
"ab"
nrtrap(x->x^3,(0,1),debug=true)
converged after 18 iterations
0.25000000000363815
nrtrap(x->x^4,(0,1),debug=true)
converged after 18 iterations
0.20000000000485063
It is a simple change to upgrade this to Simpson's rule.
"""
`nrsimp(f,range;namedargs...)`
Uses the Simpson's rule algorithm from Numerical Recipes to calculate ``\\int_a^b f(x) dx``
where `range=(a,b)`.
`namedargs`
- `eps=1E-10`: algorithm terminates when relative change in integral falls below `eps`
- `max_divisions=20`: make no more than `max_divisions` subdivisions
- `debug=false`: set to `true` to see debug information
"""
function nrsimp(f,range;eps=1E-10,max_divisions=20,debug=false)
data=trap_data(f,range)
oldtrapval=data.val
subdivide!(data)
trapval=data.val
sval=(4*trapval-oldtrapval)/3
oldsval=sval
for j in 2:max_divisions
oldtrapval=trapval
oldsval=sval
subdivide!(data)
trapval=data.val
sval=(4*trapval-oldtrapval)/3
if abs(sval-oldsval)<eps*abs(oldsval) || (sval==0 && oldsval==0)
if debug
print("converged after "*string(j)*" iterations")
end
return sval
end
end
error("Did not converge to precision "*string(eps)*
"after "*string(max_divisions)*" subdivisions."*
"previous 2 results: ("*string(oldsval)*
","*string(sval)*")")
end
nrsimp
nrsimp(x->x^4*log(x+sqrt(1+x^2)),(0,2),debug=true)
converged after 10 iterations
8.153364119820747
nrtrap(x->x^4*log(x+sqrt(1+x^2)),(0,2),debug=true)
converged after 18 iterations
8.153364120069917
Simpson's rule converged after 10 subdivisions while the trapezoid rule needed 18 – a huge difference in the number of function evaluations:
2^10
1024
2^18
262144
Optimization
I probably won’t get to this in class – but you may find it useful to read.
Note: In general, optimization is the last thing you should worry about.
First, in a complex program there are usually one or two rate-limiting steps. It is hard to predict what these will be a priori. Optimization of any part of the code, except those rate-limiting steps, is wasted.
Second, optimization is complicated. It depends not only on the programming language, but on the implementation of the language, the compiler, the compiler directives… It is often a trial and error process – and sometimes you discover that your new approach is slower than the old. In general, clean simple code will be fast.
Regardless, let's now see how we are doing with speed.
@time nrtrap(x->x^3,(0,1),debug=true)
@time naive_trap(x->x^3,(0,1),2^18)
Oh no! Our fancy version is about 8 times slower than our naive version.
Fixing this is non-trivial, as the speed of these routines depends on low-level internals. If you look at the `@time` statement for our `naive_trap` function, you see that the bulk of the time there is taken in compilation.
The other thing to note is that both versions make a large number of heap allocations. This has to do with the fact that we are passing around a function object, and the just-in-time compiler does not know what data type the function returns. Thus it has to allocate memory in a generic way.
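One way to see the problem is Julia's built-in `@code_warntype` macro, which flags variables whose types the compiler cannot infer. Since the fields of `trap_data` are untyped, `data.val` shows up as `Any`:

data=trap_data(x->x^3,(0.,1.))
@code_warntype subdivide!(data)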
We can fix this by compartmentalizing the function calls in our `subdivide!` method.
"""
alt3_trap_data(f,range)
stores the intermediate information for our trapezoid rule.
Fields:
f # store the function we are integrating
range # store the range (a,b)
val # trapezoid rule integral at current depth, m
m # depth -- number of trapezoids = 2ᵐ
To increase the depth use command `subdivide!` or `subdivide`.
"""
mutable struct alt3_trap_data
f # store the function we are integrating
range # store the range (a,b)
val # trapezoid rule integral at current depth, m
m # depth -- number of trapezoids = 2ᵐ
function alt3_trap_data(f,range)
new(f,range,(f(range[1])+f(range[2]))*(range[2]-range[1])/2,0)
end
end
function subdivide!(data ::alt3_trap_data)
# Extract needed info from `data`
f=data.f
m=data.m+1
range=data.range
nextresult=sum_even_points(f,range,2^m)
result=nextresult+data.val/2
# Update data
data.m=m
data.val=result
end
# Function barrier: all the calls to `f` happen inside this small function,
# which the compiler can specialize once it knows the concrete types.
function sum_even_points(f,range,n)
(a,b)=range
dx=(b-a)/n
result=f(a+dx)*dx
for j in 3:2:n-1
result+=f(a+j*dx)*dx
end
return result
end
"""
`alt3_nrtrap(f,range;namedargs...)`
Uses the trapezoid rule algorithm from Numerical Recipes to calculate ``\\int_a^b f(x) dx``
where `range=(a,b)`.
`namedargs`
- `eps=1E-10`: algorithm terminates when relative change in integral falls below `eps`
- `max_divisions=20`: make no more than `max_divisions` subdivisions
- `debug=false`: set to `true` to see debug information
"""
function alt3_nrtrap(f,range;eps=1E-10,max_divisions=20,debug=false)
data=alt3_trap_data(f,range)
oldval=0.
val=0.
for j in 1:max_divisions
oldval=data.val
subdivide!(data)
val=data.val
if abs(val-oldval)<eps*abs(oldval) || (val==0 && oldval==0)
if debug
print("converged after "*string(j)*" iterations")
end
return val
end
end
error("Did not converge to precision "*string(eps)*
"after "*string(max_divisions)*" subdivisions."*
"previous 2 results: ("*string(oldval)*
","*string(val)*")")
end
@time nrtrap(x->x^3,(0.,1.),debug=true)
@time alt3_nrtrap(x->x^3,(0.,1.),debug=true)
Apparently we got somewhere between a factor of 5 and 10 speedup. We can do even better if we tell the compiler what types to expect from the function we are integrating:
function f1(x::Float64)::Float64
return x^3
end
@time alt3_nrtrap(f1,(0.,1.),debug=true)
This gave us another factor of 20 in speed! Note the drastic reduction in heap allocations.
Let's see how this tweak behaves for our original implementation:
@time nrtrap(f1,(0.,1.),debug=true)
Apparently the original form did not benefit at all from declaring the types for the input function.
Optimization is hard!
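Hard – but one idiom is worth knowing. A hedged sketch (the name `param_trap_data` is mine, and I assume floating-point integration limits): making the struct parametric in the function's type bakes the concrete type of `f` into the object, so methods like `subdivide!` specialize on it without annotating the integrand.

# A parametric version of trap_data: the concrete type F of the function
# becomes part of the struct's type, letting methods specialize on it.
mutable struct param_trap_data{F}
    f::F
    range::Tuple{Float64,Float64}
    val::Float64
    m::Int
    function param_trap_data{F}(f::F,range) where {F}
        new{F}(f,float.(range),(f(range[1])+f(range[2]))*(range[2]-range[1])/2,0)
    end
end
param_trap_data(f,range)=param_trap_data{typeof(f)}(f,range)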