ComplexityMeasures.jl
Discussion on invariant measure estimation
Let's say I'm interested in getting the transfer matrix and the invariant measure for a time series.
using DynamicalSystems
henon_rule(x, p, n) = SVector{2}(1.0 - p[1]*x[1]^2 + x[2], p[2]*x[1])
henon = DeterministicIteratedMap(henon_rule, zeros(2), [1.4, 0.3])
orbit, t = trajectory(henon, 20_000; Ttr = 500)
As the docs say, we can get an estimate for the invariant measure by using probabilities:
using ComplexityMeasures
grid_size = 20
grid_along_1d = range(-2, 2; length = grid_size)
grids = (grid_along_1d, grid_along_1d)
binning = FixedRectangularBinning(grids)
p_est = probabilities(RelativeAmount(), ValueBinning(binning), orbit) # estimate by counting visitations
According to the docs, using probabilities with TransferOperator gives the invariant measure (the stationary distribution over the states) instead of just an estimate from counting:
ρ_to = probabilities(TransferOperator(binning), orbit)
If I understand correctly, another way of doing this is with invariantmeasure:
iv = invariantmeasure(orbit, binning)
ρ_to and iv.ρ are not exactly the same, although very similar:
iv.ρ[1] == ρ_to[1] # false
iv.ρ[1] ≈ ρ_to[1] # true
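One way to quantify "very similar" (my own quick check; both vectors cover the same bins since the binning is identical):
maximum(abs.(iv.ρ .- ρ_to)) # small but nonzero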
Is this difference because of the random initialization of the distribution?
If so, the docs are not really clear on this issue (at least to me).
The TransferOperator docs say: "In practice, the invariant measure ρ is computed using invariantmeasure, which also approximates the transfer matrix. The invariant distribution is initialized as a length-N random distribution which is then applied to P."
But then the invariantmeasure docs say that the iterative process starts from the distribution estimated by counting, not a random one: "Estimate an invariant measure over the points in x based on binning the data into rectangular boxes dictated by the binning, then approximate the transfer (Perron-Frobenius) operator over the bins."
Wouldn't it be more efficient if the invariant distribution was initialized as p_est, not a random vector?
Another curiosity: is there a way to get the P transfer matrix without calling invariantmeasure first?
Let me know if I misunderstood anything.
cc @kahaaga, who is the expert on the transfer operator!
This might be a separate issue, but since it is related, I'll mention it here.
Why does one get a different number of states in the outcome space when using probabilities vs. invariantmeasure?
using DynamicalSystems
henon_rule(x, p, n) = SVector{2}(1.0 - p[1]*x[1]^2 + x[2], p[2]*x[1])
henon = DeterministicIteratedMap(henon_rule, zeros(2), [1.4, 0.3])
orbit, t = trajectory(henon, 20_000; Ttr = 500)
using ComplexityMeasures
grid_size = 20
binning = RectangularBinning(grid_size)
p_cm = probabilities(ValueBinning(binning), orbit) # 100 states
which gives 100 states.
iv = invariantmeasure(orbit, binning)
P_cm, symbols = transfermatrix(iv) # 101 states??
which in contrast gives 101 states, despite using the same binning.
Hey @rusandris!
Is there a way to get the P transfer matrix without calling invariantmeasure first?
Yes. The transfer matrix is estimated first, then the stationary distribution is estimated from the transfer matrix. Given your example, you can do:
# This will be a `TransferOperatorApproximationRectangular`, which is just a simple struct that
# stores relevant information about the transfer operator and its binning.
julia> to = ComplexityMeasures.transferoperator(orbit, binning);
julia> to.transfermatrix
32×32 SparseArrays.SparseMatrixCSC{Float64, Int32} with 92 stored entries:
⎡⠫⠣⡀⠀⡂⠠⠀⠀⠢⠀⠀⠀⠀⠀⠀⠀⎤
⎢⣀⢀⠈⠢⠀⠀⠂⠈⠀⠊⠀⠂⠠⠄⠀⠠⎥
⎢⠠⠠⡈⠀⠈⠢⠈⠈⠈⠀⠐⠈⠀⠀⠀⠀⎥
⎢⠀⠀⠂⠈⠢⡀⠠⠢⠀⠀⠐⡀⢄⡀⠁⠀⎥
⎢⠁⠀⠂⡠⠂⠀⠈⠀⠈⠀⡀⠈⠀⠀⠀⡀⎥
⎢⠀⠀⠀⠠⠀⠐⠑⡀⠐⠀⢀⠀⠀⠈⠌⠀⎥
⎢⠒⠀⠀⢰⠀⡁⠀⠀⠀⠀⠈⠀⢀⠀⠄⠀⎥
⎣⣈⠈⠀⠀⠀⠀⠄⠀⠀⠀⠀⠄⠐⠆⠀⠀⎦
The (i, j)-th entry of this matrix is the transition probability from state (bin) i to bin j, and you can see that the number of bins is the same as the dimension of the transfer matrix:
julia> to.bins
32-element Vector{Int64}:
203
202
204
201
187
214
141
184
186
216
⋮
162
217
197
164
163
160
185
159
140
183
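As a sanity check on the convention just stated (my own quick check, not something from the docs): row i is a conditional distribution over target bins, so the row sums should be approximately one (up to bins with no recorded outgoing transition):
julia> sum(to.transfermatrix; dims = 2) # rows ≈ 1.0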
To get the invariant measure, now simply do:
iv = invariantmeasure(to)
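From there, you can pull out the transfer matrix and the measure itself just like in your snippet:
P, symbols = transfermatrix(iv) # the approximated transfer matrix and its bins
iv.ρ # the estimated invariant measure over those bins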
According to the docs, using probabilities with TransferOperator gives the invariant measure (the stationary distribution over the states) instead of just an estimate from counting: If I understand correctly, another way of doing this is with invariantmeasure:
The estimation procedure for the stationary distribution over the states is:
- Bin the data.
- Estimate an N-by-N-sized transfer matrix T using ComplexityMeasures.transferoperator (as shown above), where N is the number of bins.
- Apply T repeatedly to any length-N distribution p until convergence (see the sketch below). By convergence, I mean "p doesn't change beyond some semi-arbitrarily determined threshold". The point of this is to estimate the left eigenvector of T.
- The resulting distribution is the invariant measure, or the stationary distribution over the states.
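To make the iteration step concrete, here is a minimal, self-contained sketch of it (my own illustration, not the package's internal code). It assumes T is row-stochastic, with T[i, j] the transition probability from bin i to bin j, and takes an optional init for the starting distribution:
function stationary_distribution(T; init = nothing, tol = 1e-12, maxiter = 10_000)
    N = size(T, 1)
    p = init === nothing ? rand(N) : collect(Float64, init) # random start unless a guess is given
    p ./= sum(p)                                 # normalize to a probability vector
    for _ in 1:maxiter
        p_new = T' * p                           # left action pᵀT: one step of the chain
        p_new ./= sum(p_new)                     # guard against floating-point drift
        maximum(abs.(p_new .- p)) < tol && return p_new
        p = p_new
    end
    return p                                     # may not be fully converged
end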
The reason why you're seeing slightly different results in your example is that you're essentially estimating the invariant distribution twice (two separate calls to the same underlying function). Since the distribution p is randomly initialized, you're getting similar but slightly different results. It should be possible to specify an rng argument to TransferOperator and the underlying functions; then you'd get identical results for both calls. I'll open a separate issue for this.
But then the invariantmeasure docs say that the iterative process starts from the estimated distribution from the counting, not random: "Estimate an invariant measure over the points in x based on binning the data into rectangular boxes dictated by the binning, then approximate the transfer (Perron-Frobenius) operator over the bins. "
I'm not quite sure if I understand this comment. Perhaps it would be clearer if the sentence was re-ordered, e.g. "Bin the data into rectangular boxes according to binning, approximate the transfer (Perron-Frobenius) operator over the bins (i.e. the transfer matrix T), then find the left eigenvector of this operator by repeatedly applying T to a randomly initialized distribution p."
Wouldn't it be more efficient if the invariant distribution was initialized as p_est, not a random vector?
I suppose it is reasonable to think convergence could be achieved faster if p_est was used. I'm not sure what the runtime difference would be in practice, but it could be worth trying out.
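For instance, here is a toy comparison using the stationary_distribution sketch from above (again a hypothetical helper, not the package API), with a perturbed copy of the true measure standing in for a counting-based estimate:
using LinearAlgebra, Random
N = 200
T = rand(MersenneTwister(42), N, N)
T ./= sum(T; dims = 2)                            # make T row-stochastic
p_true = stationary_distribution(T)
p_guess = normalize(p_true .+ 0.01 .* rand(N), 1) # stand-in for a counting estimate
@time stationary_distribution(T)                  # random initialization
@time stationary_distribution(T; init = p_guess)  # warm start from the guess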
This might be a separate issue, but since it is related, I'll mention it here. Why does one get a different number of states in the outcome space when using probabilities vs. invariantmeasure?
using DynamicalSystems
henon_rule(x, p, n) = SVector{2}(1.0 - p[1]*x[1]^2 + x[2], p[2]*x[1])
henon = DeterministicIteratedMap(henon_rule, zeros(2), [1.4, 0.3])
orbit, t = trajectory(henon, 20_000; Ttr = 500)
using ComplexityMeasures
grid_size = 20
binning = RectangularBinning(grid_size)
p_cm = probabilities(ValueBinning(binning), orbit) # 100 states
which gives 100 states.
iv = invariantmeasure(orbit, binning)
P_cm, symbols = transfermatrix(iv) # 101 states??
which in contrast gives 101 states, despite using the same binning.
Nice catch! This is related to #328. You're getting an extra bin which is symbolized as -1 (i.e. the point is outside the binning; see the third-to-last element below).
julia> to = ComplexityMeasures.transferoperator(orbit, binning);
julia> to.bins
101-element Vector{Int64}:
275
296
314
277
332
240
382
28
155
295
⋮
260
116
279
91
217
111
385
-1
328
75
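A quick one-liner to check for that out-of-binning symbol in your own runs (my convenience check, not a package function):
julia> any(==(-1), to.bins) # true if some point fell outside the binning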
I'm moving this into a separate issue.
@rusandris Hey again. On main, you can now specify the rng argument to get reproducible results.
using ComplexityMeasures, Random, Test
b = RectangularBinning(5) # any binning works here
D = StateSpaceSet(rand(MersenneTwister(1234), 100, 2))
# Resetting the rng will lead to identical results.
p1 = probabilities(TransferOperator(b; rng = MersenneTwister(1234)), D)
p2 = probabilities(TransferOperator(b; rng = MersenneTwister(1234)), D)
@test all(p1 .== p2) # true: both calls start from the same random vector when estimating the invariant measure
# Not resetting the rng will lead to non-identical results.
rng = MersenneTwister(1234)
p1 = probabilities(TransferOperator(b; rng), D)
p2 = probabilities(TransferOperator(b; rng), D)
@test !all(p1 .== p2) # true: different random vectors are drawn for p1 and p2
@test p1[1] ≈ p2[1] # but the estimates should be close
I suppose it is reasonable to think convergence could be achieved faster if p_est was used. I'm not sure what the runtime difference would be in practice, but it could be worth trying out.
Absolutely. It's not obvious which method would be faster, and it might also depend on the length of the time series, the resolution of the binning, etc.