ComplexityMeasures.jl
ComplexityMeasures.jl copied to clipboard
Optimize transfer operator estimation
We were looking into reducing the overlap between our package StateTransitionNetworks.jl and ComplexityMeasures.
Since the transition matrix or transfer (Perron-Frobenius) operator plays a central role in our package, we also wrote code for it and we noticed that the approach here used in transferoperator
can be made more minimal and more performant.
This measures the time to get the transition matrix from a fairly long ($10^7$) orbit of the Hénon map using the two ways: transferoperator
from ComplexityMeasures
and calculate_transition_matrix
from StateTransitionNetworks
. 2D binning is used to discretize in both.
using DynamicalSystems
using BenchmarkTools
#Henon system orbit
henon_rule(x, p, n) = SVector{2}(1.0 - p[1]*x[1]^2 + x[2], p[2]*x[1])
henon = DeterministicIteratedMap(henon_rule, zeros(2), [1.4, 0.3])
orbit, t = trajectory(henon, 10^7; Ttr = 10^4)
grid_size = 10 #10x10 grid binning
#grid_size = 100 #100x100 grid binning
#-----------------STN way-------------------
using StateTransitionNetworks
@btime begin
symbolic_ts = timeseries_to_grid(orbit,grid_size) #get the symbolic time series
P_stn,Q_stn,x_stn = calculate_transition_matrix(symbolic_ts) #calculate transition matrix
P_stn
end
#for grid_size = 10 -> 1.480 s (20000065 allocations: 1.64 GiB)
#for grid_size = 100 -> 1.602 s (20000083 allocations: 1.64 GiB)
#----------------CM way----------------------
using ComplexityMeasures
@btime begin
binning = RectangularBinning(grid_size)
to = ComplexityMeasures.transferoperator(orbit, binning);
P_cm = to.transfermatrix
end
#for grid_size = 10 -> 2.098 s (1545 allocations: 828.16 MiB)
#for grid_size = 100 -> 3.333 s (19205 allocations: 830.85 MiB)
#are these the same?
println(P_stn.nzval[1:3]) #[0.5980376275657787, 0.7666443626630981, 0.6996943940406837]
println(P_cm.nzval[1:3]) #[5.894488653109343e-6, 0.5980376275657788, 0.766644362663098]
#yes except one additional value in P_cm which is almost zero
all(isapprox.(P_stn.nzval, P_cm.nzval[2:end];atol=1e-3)) #true
I think one of the reasons for these differences in runtime and scaling is that the visitors
list used in transferoperator
isn't actually necessary if we just want the transition matrix. Right now a list is created (that contains all visitors of each bin) and transition probabilities are computed by looping through the visitors list.
This is what I mean by making more minimal. If visitors
isn't actually used anywhere else, it can be skipped and the transition probabilities can be calculated by looping through the symbolic orbit (series of outcomes, pts
) once.
I've thrown together an initial PR just to show the difference: #423 (very rough, but some features can be added back later like boundary conditions, stochasticity checking etc. )
To illustrate this point further, I timed (using simple @time
inside transferoperator
) how much is actually taken up by creating visitors
and the calculation of probabilities:
#---------------------with visitors (original)-------------------
@time ComplexityMeasures.transferoperator(pts, binning);
visitors list time:
0.787510 seconds
loop time (compute probabilities):
0.257154 seconds
Total 1.747746 seconds
#---------------------without visitors (PR)-------------------
@time ComplexityMeasures.transferoperator(pts, binning);
loop time compute probabilities):
0.157713 seconds
Total: 0.730922 seconds
Let me know if this is worth the effort. There was a discussion about this on Slack with @Datseris and @sbulcsu but we decided to move it here. Also see #424 about generalizing the transfer operator because the design here might also change if we decide to introduce a transferoperator
that's not tied to binning.