Associations.jl

"Dimension of source embedding must be 1 to be applicable with surrogate methods"?

Open Datseris opened this issue 2 years ago • 2 comments

I am confused about the surrogate significance tests...

I get the error "Dimension of source embedding must be 1 to be applicable with surrogate methods" if I do something like

embedding = EmbeddingTE(; dS = 3, dT = 3, dC = 3)
estimator = FPVP()
test = SurrogateTest(TEShannon(; embedding), estimator; nshuffles = 100)

and then call independence. How does this limitation make sense from a scientific perspective? Surely estimating the transfer entropy with only 1 dimension (and hence, no going into the past at all) from the source doesn't really make sense given the definition of transfer entropy, right?

How does this limitation come about? Why is it not possible to first shuffle the source timeseries and then embed it (which is what I would expect would happen)?

Datseris avatar Oct 05 '23 10:10 Datseris

This is also related to https://github.com/JuliaDynamics/TimeseriesSurrogates.jl/issues/136.

> Surely estimating the transfer entropy with only 1 dimension from the source doesn't really make sense given the definition of transfer entropy, right?

It does, because TE(x → y) = CMI(y(t+1), x(t)^(-) | y(t)^(-)), where CMI is the conditional mutual information and ^(-) indicates an embedding with negative lags. The embedding may be 1-dimensional, meaning that you just use the raw time series. This is perfectly valid, and for very short time series, a 1-dimensional embedding for each marginal is the only reasonable thing to do.
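To make the role of the embedding dimension concrete, here is a minimal sketch in plain Julia that builds the three marginals entering the CMI above. It does not use any Associations.jl calls: `delay_embed` and `te_marginals` are hypothetical helper names, and the lag convention (unit lags, most recent value first) is an assumption made for illustration. With dS = dT = 1, the source and conditioning marginals reduce to the raw series.

```julia
# Hypothetical helper: delay vectors (x[t], x[t-1], ..., x[t-d+1]) for each t in ts.
delay_embed(x, d, ts) = [x[t - k] for t in ts, k in 0:d-1]

# Build the marginals entering CMI(y(t+1), x(t)^(-) | y(t)^(-)).
# Assumption: unit lags; dS and dT are the source and target embedding dimensions.
function te_marginals(x, y; dS = 1, dT = 1)
    length(x) == length(y) || throw(ArgumentError("x and y must have equal length"))
    ts = max(dS, dT):length(x)-1          # times with a full history and a future point
    future      = y[ts .+ 1]              # y(t+1)
    source_past = delay_embed(x, dS, ts)  # x(t)^(-), size (length(ts), dS)
    target_past = delay_embed(y, dT, ts)  # y(t)^(-), size (length(ts), dT)
    return future, source_past, target_past
end

x = collect(1.0:10.0)
y = collect(101.0:110.0)

# 1-dimensional embeddings: the marginals are just (aligned) raw series.
future, source_past, target_past = te_marginals(x, y; dS = 1, dT = 1)

# 3-dimensional embeddings: each row holds three consecutive past values.
future3, source_past3, target_past3 = te_marginals(x, y; dS = 3, dT = 3)
```

Even in the 1-dimensional case the target's future y(t+1) is related to the source's present x(t), so the estimate still looks one step back in time.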

> Why is it not possible to first shuffle the source timeseries and then embed it (which is what I would expect would happen)?

This is a design choice. In the current implementation, I decided to shuffle the relevant marginal after embedding (I think because I figured it would save some allocations of new StateSpaceSets for all marginals for every surrogate realization). It is also, of course, possible to shuffle before embedding. I think we should enable both approaches.
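The two orderings can be sketched as follows. This is an illustration in plain Julia using only the `Random` standard library, not the package's implementation: `delay_embed` is a hypothetical helper, and a plain random shuffle stands in for whatever surrogate method is actually used.

```julia
using Random

# Hypothetical helper: rows are delay vectors (x[t], x[t-1], ..., x[t-d+1]).
delay_embed(x, d) = [x[t - k] for t in d:length(x), k in 0:d-1]

rng = MersenneTwister(1234)
x = collect(1.0:8.0)   # toy source series
d = 3                  # source embedding dimension

# Option A: shuffle the raw source series, then embed the surrogate.
surrogate_then_embed = delay_embed(shuffle(rng, x), d)

# Option B: embed first, then shuffle the embedded points (rows) as whole units.
E = delay_embed(x, d)
embed_then_shuffle = E[randperm(rng, size(E, 1)), :]
```

In option B every row is still a genuine delay vector of x, only moved to a different time index relative to the other marginals. In option A the delay vectors themselves are built from the shuffled series. For a random shuffle both are easy to write down. For surrogate methods that are defined on a single scalar series, option B presumably has no direct counterpart when d > 1, which would explain the dimension check.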

kahaaga avatar Oct 05 '23 11:10 kahaaga

In fact, we have to enable both approaches to avoid being too restrictive here. It may happen that one wants to consider multiple timeseries together as the source, and then one would have to use multidimensional surrogates, as described in https://github.com/JuliaDynamics/TimeseriesSurrogates.jl/issues/136.
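As a rough illustration of what a multidimensional surrogate could look like in the simplest case, the sketch below applies one random permutation jointly to all source series. This is an assumption-laden toy example in plain Julia (only `Random` from the standard library), not the TimeseriesSurrogates.jl API and not necessarily what the linked issue proposes.

```julia
using Random

rng = MersenneTwister(42)
x1 = collect(1.0:6.0)
x2 = 10 .* x1             # second source series, tied to x1 at every time step

S = hcat(x1, x2)          # source as an N×2 array, one row per time step
perm = randperm(rng, size(S, 1))
S_surr = S[perm, :]       # one permutation applied to all columns jointly
```

Permuting whole rows keeps the relationship between the source series at each time step intact while destroying their temporal alignment with the target, which is the property a source surrogate needs for this kind of test.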

kahaaga avatar Oct 05 '23 12:10 kahaaga