ComplexityMeasures.jl

Reproducibility for `OrdinalPatternEncoding`

Open kahaaga opened this issue 1 year ago • 3 comments

Currently, the `isless_rand` function, which we use as the default value comparator for `OrdinalPatternEncoding`, will not give reproducible results. This is because we don't provide an `rng` argument to the `rand` call.

To solve this, we could either:

  • Force the user to provide their custom `lt` function, taking care of reproducibility themselves (not preferable IMHO)
  • Include `rng` as a field to both `OrdinalPatternEncoding` and `OrdinalPatterns`, which gets passed on to `isless_rand`.
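To make the second option concrete, here is a minimal sketch of an rng-aware tie-breaking comparator. It uses only `Random` and `sortperm` from Julia itself. The names `isless_rand_rng` and `ordinal_perm` are hypothetical and not part of the package, and the comparator body is an assumption about how random tie-breaking might look, not a copy of the existing `isless_rand`.

```julia
using Random

# Hypothetical rng-aware comparator (assumed shape, not the package's code):
# ties are broken with a coin flip drawn from the supplied rng.
function isless_rand_rng(rng::AbstractRNG, a, b)
    if a == b
        return rand(rng, Bool)
    else
        return a < b
    end
end

# Hypothetical helper: sort permutation of a state vector, with the rng
# forwarded to the comparator through a closure.
ordinal_perm(rng::AbstractRNG, x) = sortperm(x; lt = (a, b) -> isless_rand_rng(rng, a, b))

x = [0.3, 0.1, 0.3, 0.3, 0.2]  # contains tied values

# Two generators seeded identically produce the same permutation.
p1 = ordinal_perm(MersenneTwister(1234), x)
p2 = ordinal_perm(MersenneTwister(1234), x)
println(p1 == p2)
```

With the global, unseeded `rand(Bool)` in place of `rand(rng, Bool)`, the two permutations could differ between calls whenever a tie is encountered.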

Or maybe there's a better way?

kahaaga avatar Jan 13 '24 09:01 kahaaga

`rng` should be a field of the encoding. It doesn't have to be a field of `OrdinalPatterns`; it only needs to be given as an input to it.
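A rough sketch of that layout, assuming stand-in types: `TiedOrdinalEncoding`, `encode_pattern` and `make_encoding` are hypothetical names used only for illustration and do not reflect the actual definitions of `OrdinalPatternEncoding` or `OrdinalPatterns`. The encoding stores the rng, while the outer constructor only accepts it as a keyword and forwards it.

```julia
using Random

# Hypothetical stand-in for an encoding type that stores the rng as a field.
struct TiedOrdinalEncoding{R<:AbstractRNG}
    m::Int   # length of the state vectors
    rng::R   # generator used only for breaking ties
end

# Hypothetical encoder: returns the sort permutation of one state vector,
# breaking ties with the rng stored in the encoding.
function encode_pattern(enc::TiedOrdinalEncoding, sv::AbstractVector)
    length(sv) == enc.m || throw(ArgumentError("expected a state vector of length $(enc.m)"))
    lt = (a, b) -> a == b ? rand(enc.rng, Bool) : a < b
    return sortperm(sv; lt = lt)
end

# Hypothetical outer constructor: takes rng as input and passes it on,
# without storing it anywhere else.
make_encoding(; m::Int = 3, rng::AbstractRNG = Random.default_rng()) =
    TiedOrdinalEncoding(m, rng)

vectors = [[1, 1, 2], [2, 2, 2], [3, 1, 3]]
enc_a = make_encoding(m = 3, rng = MersenneTwister(42))
enc_b = make_encoding(m = 3, rng = MersenneTwister(42))
perms_a = [encode_pattern(enc_a, v) for v in vectors]
perms_b = [encode_pattern(enc_b, v) for v in vectors]
println(perms_a == perms_b)
```

Because the rng state advances with every tie, reproducibility in this sketch holds for the whole sequence of encoded vectors, not for each vector in isolation.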

Datseris avatar Jan 14 '24 11:01 Datseris

> will not give reproducible results

Note that this is only a problem if there are duplicate datapoints in the timeseries.

Datseris avatar Jan 14 '24 11:01 Datseris

> Note that this is only a problem if there are duplicate datapoints in the timeseries.

The problem occurs if there are tied values inside any state vector, since it is the individual state vectors that are sorted to map them onto an ordinal pattern.
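The distinction can be checked with a small sketch. The helpers `state_vectors` and `has_ties` are hypothetical and build a plain delay embedding by hand, assuming embedding dimension `m` and lag `τ`. A series can contain duplicate values without any state vector containing a tie, depending on `m` and `τ`.

```julia
# Hypothetical helper: delay-embedded state vectors of dimension m and lag τ.
state_vectors(x, m, τ) = [x[i:τ:(i + (m - 1) * τ)] for i in 1:(length(x) - (m - 1) * τ)]

# A state vector has a tie if any value occurs more than once in it.
has_ties(v) = !allunique(v)

a = [1, 2, 3, 1, 2, 3]  # duplicates in the series
b = [1, 1, 2, 3, 4, 5]  # adjacent duplicates

println(any(has_ties, state_vectors(a, 2, 1)))  # false: no state vector has a tie
println(any(has_ties, state_vectors(a, 4, 1)))  # true: [1, 2, 3, 1] has a tie
println(any(has_ties, state_vectors(b, 2, 1)))  # true: [1, 1] has a tie
```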

kahaaga avatar Jan 17 '24 01:01 kahaaga