
Fluctuation complexity: restrict possibilities to formally defined self-informations


What's this?

Here I address #410 and restrict the fluctuation complexity to information measures for which it is possible to define "self-information" in the following sense.

Given an information measure H, I define the "generalized" self-information as the functional I(p_i) that allows us to rewrite the expression for H as a probability-weighted sum, H = sum_i p_i * I(p_i) (a weighted average; since sum_i p_i = 1, the denominator of the weighted average doesn't appear explicitly).

Next, the fluctuation complexity is the square root of sum_{i=1}^N p_i * (I(p_i) - H)^2. Hence, using the formulation above, we can meaningfully speak about the fluctuation of local information around the mean information, regardless of which measure is chosen.
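
In display form (writing the fluctuation complexity as σ_I just for brevity here), the two quantities above are:

```math
H = \sum_{i=1}^{N} p_i \, I(p_i), \qquad
\sigma_I = \sqrt{\sum_{i=1}^{N} p_i \,\bigl(I(p_i) - H\bigr)^2}.
```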

I also require that the generalized self-information yields a fluctuation complexity with the same properties as the original Shannon-based fluctuation complexity:

  • that it is zero for the uniform distribution,
  • that it is zero for distributions where p_k = 1 and p_i = 0 for i != k.

Note that we don't involve the axioms which Shannon self-information fulfills at all: we only demand that the generalized self-information is the functional with the properties above. I haven't been able, at least so far, to find any papers in the literature that deal with this concept for Tsallis or other generalized entropies, so I think it is safe to explore with this naming convention.

New design

  • I introduce a new API method self_information(measure::InformationMeasure, p_i, args...).
  • This method is called inside information(measure::FluctuationComplexity, args...) to compute the I(p_i) terms inside the sum of the fluctuation complexity. Only measures that implement self_information are valid; otherwise an error is thrown (see the sketch below).
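
To make the dispatch idea concrete, here is a minimal sketch of what I have in mind. Everything beyond the `self_information`/`information` names and the `InformationMeasure`/`FluctuationComplexity`/`Shannon` types is an assumption for illustration (in particular, the `definition` field, the `base` field access, and the error message), not the final implementation:

```julia
# Fallback: measures that don't define a generalized self-information
# cannot be used with FluctuationComplexity.
function self_information(measure::InformationMeasure, pᵢ, args...)
    throw(ArgumentError(
        "`self_information` is not implemented for $(typeof(measure)), " *
        "so it can't be used with `FluctuationComplexity`."))
end

# Shannon: H = Σᵢ pᵢ (-log(pᵢ)), so I(pᵢ) = -log(pᵢ).
self_information(e::Shannon, pᵢ) = -log(e.base, pᵢ)

# Fluctuation complexity: sqrt(Σᵢ pᵢ (I(pᵢ) - H)²).
# Zero-probability outcomes are skipped to avoid 0 * Inf = NaN.
function information(fc::FluctuationComplexity, probs)
    h = information(fc.definition, probs)  # the mean information H
    return sqrt(sum(pᵢ * (self_information(fc.definition, pᵢ) - h)^2
                    for pᵢ in probs if pᵢ > 0))
end
```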

Progress

I've made the necessary derivations for the measures where calculations looked easiest: Shannon entropy/extropy, Tsallis entropy and Curado entropy. I'll fill in the gaps for the rest of the measures whenever I get some free time.
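
For reference, the Shannon and Tsallis cases rewrite as follows (modulo base and constant conventions, e.g. the k constant in Tsallis' definition); the Tsallis self-information is essentially the q-logarithm of 1/p_i:

```math
H_S = -\sum_i p_i \log p_i = \sum_i p_i \underbrace{(-\log p_i)}_{I_S(p_i)},
\qquad
H_T^{(q)} = \frac{1 - \sum_i p_i^q}{q - 1}
          = \sum_i p_i \underbrace{\frac{1 - p_i^{\,q-1}}{q - 1}}_{I_T^{(q)}(p_i)}.
```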

I'm writing this all up in a paper, where I also highlight ComplexityMeasures.jl and how easy it is to use the measure in practice thanks to our discrete estimation API. I've essentially finished the intro and method sections, but the experimental part remains to be done. For that, I need functional code. So before I proceed, I'd like to get your input on this code proposal, @Datseris. Does this dispatch-based system make sense?

Pending the paper, I verify correctness by numerical comparison in the test suite: I rewrite the information measures as weighted sums involving self_information and check that we obtain the same value as when computing each measure using its traditional formulation.
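
Roughly, this is the kind of check I mean (a standalone sketch for the Shannon case with the natural log, not the actual test-suite code):

```julia
# Hypothetical Shannon self-information (natural log) for the check below.
self_information_shannon(pᵢ) = -log(pᵢ)

p = [0.1, 0.2, 0.3, 0.4]

# Traditional formulation of Shannon entropy.
h_traditional = -sum(pᵢ * log(pᵢ) for pᵢ in p)

# Rewritten as a probability-weighted sum of self-informations.
h_rewritten = sum(pᵢ * self_information_shannon(pᵢ) for pᵢ in p)

@assert h_traditional ≈ h_rewritten
```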
