Distributions.jl icon indicating copy to clipboard operation
Distributions.jl copied to clipboard

`LogitNormal` disrupts types

Open bgctw opened this issue 2 years ago • 8 comments

For some number-crunching project we work with Float32 instead of Float64. Most distributions in Distributions.jl allow to construct a Float32-based version by providing a Type parameter, nice.

However, when I transform a Float32 based LogitNormal distribution (e.g. shift mean by +) the mean applied to the distribution gives me a Float64 instead of a Float32.

The reason is

  • AffineTransformation uses eltype(basedist) to determine its number and parameter type
  • default eltype of an uniform continuous distribution is Float64

The normal distribution already defines Base.eltype(::Type{Normal{T}}) where {T} = T

Should we add such a method to all the other distributions that are based on a parametric type?

Here is a minimal (not) working example

    dln = LogitNormal{Float32}(0f0, sqrt(2f0))
    dshifted = dln * 2f0 + 1f0
    @test params(dshifted)[1] isa Float32
    @test mean(dshifted) isa Float32

bgctw avatar Sep 05 '23 12:09 bgctw

@devmotion commented on a commit: "The parameter type of a distribution can be retrieved by partype, eltype is supposed to be the (element) type of rand"

Computing the mean of distribution, to my understanding, should return the same type as a random number drawn. Moreover, all the uniform continuous distributions which I looked at, enforce the same parametric type as the type of the parameters.

This also leads to some strange behaviour such as: d = LogUniform(1,10) leading to parametric type Int64 but the Tests still check @test eltype(d) === Float64

Are the terms, concepts, and types involved described somewhere? My expectation would be the Type Hierarchy section of the documentation. However, the parametric types of the distributions and the types of the parameters are not really clarified there. Could you, please, point me to where I can read and understand these concepts?

bgctw avatar Sep 06 '23 14:09 bgctw

Presumably there's no documentation because people strongly disagree about the design and its implications, as shown in #1071 and its many duplicates. I had a go at it (and some other issues) in https://github.com/JuliaStats/Distributions.jl/pull/1433 but the PR got stuck since I had to work on other things.

devmotion avatar Sep 06 '23 15:09 devmotion

julia> eltype(dln)
Float64
julia> typeof(rand(dln))
Float64

So it seems like the bug is in LogitNormal, not in AffineDistribution.

ParadaCarleton avatar Sep 07 '23 06:09 ParadaCarleton

@ParadaCarleton: In my suggestion and proposed commit, I actually changed LogitNormal (and many other distributions) but changing eltype breakes many tests. Currently, many tests seem to expect the eltype of a continuous distribution to be fixed to Float64.

My next suggestion would be to make AffineDistribution use partype instead of eltype. However, this breaks tests for the discretenonparametric, where rational numbers provided to the transformation are promoted to Float64 or Float32.

bgctw avatar Sep 07 '23 07:09 bgctw

As a workaround I suggest providing a constructor to AffineDistribution that allows the user to explicitly specify the type of mu and sigma to be Float32 at his own risk.

bgctw avatar Sep 07 '23 07:09 bgctw

In my suggestion and proposed commit, I actually changed LogitNormal (and many other distributions) but changing eltype breakes many tests. Currently, many tests seem to expect the eltype of a continuous distribution to be fixed to Float64.

That's bizarre, to say the least. If that happens, those tests need to be fixed.

My next suggestion would be to make AffineDistribution use partype instead of eltype. However, this breaks tests for the discretenonparametric, where rational numbers provided to the transformation are promoted to Float64 or Float32.

Using partype would be incorrect, though. partype and eltype can be different. eltype is supposed to be the type of rand(r).

As a workaround I suggest providing a constructor to AffineDistribution that allows the user to explicitly specify the type of mu and sigma to be Float32 at his own risk.

That's a pretty ugly hack. Julia functions are supposed to propagate the types of their arguments.

ParadaCarleton avatar Sep 07 '23 17:09 ParadaCarleton

It's a deliberate design choice that rand returns samples of type Float64. There are multiple discussions in this repo about eltype/partype and rand, that also explain and discuss the current design. I guess we should try to avoid replicating those in this issue here.

devmotion avatar Sep 07 '23 20:09 devmotion

It's a deliberate design choice that rand returns samples of type Float64.

Where was that choice made? By who?

ParadaCarleton avatar Sep 08 '23 00:09 ParadaCarleton