IntervalSets.jl

Should eltype(1..2) be Float64? (currently it is Int)

Open zsunberg opened this issue 3 years ago • 33 comments

It seems quite strange to me that eltype(1..2) is Int because the elements in the mathematical set [1,2] are real numbers, and real numbers are usually represented by Float64.

Happy to make a PR if people support this.

zsunberg avatar Aug 18 '22 04:08 zsunberg

This has caused confusion in JuliaReinforcementLearning/CommonRLSpaces.jl#12.

zsunberg avatar Aug 18 '22 04:08 zsunberg

Why Float64?

julia> big(π) in 0..4
true

I think eltype is just a poorly defined concept here....

dlfivefifty avatar Aug 18 '22 09:08 dlfivefifty

I'm not sure we need eltype function for non-iterable objects such as ClosedInterval.

The most consistent way would be to rename eltype(::Interval) to boundstype(::Interval), I guess.

hyrodium avatar Aug 18 '22 09:08 hyrodium

I once had a crazy idea of implementing iterate using nextfloat. This would make sense of eltype (so iterate(0..1) would be equivalent to [0,1]). Unfortunately there are a lot of floats so it was a completely useless idea!

So I think removing eltype would be wise. But this also affects DomainSets.jl. @daanhb thoughts?

dlfivefifty avatar Aug 18 '22 10:08 dlfivefifty

I think the ship has sailed, up to some details. The choice in this package is that an interval is defined by the membership function a <= x <= b, regardless of what the types of the endpoints and of x are. This is the most sensible mathematical definition of the continuous interval. In developing DomainSets, so far anything that would lead to restrictions on types over mathematical flexibility or correctness has turned out to be a bad idea. So, first of all, I'm happy with an interval defined by integers containing real numbers (and anything else that can be compared to an integer).
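A minimal sketch of that membership definition, using only Base. The type name MyClosedInterval is a hypothetical stand-in and not the actual IntervalSets.jl type; the point is only that the comparison a <= x <= b puts no restriction on the type of x.

```julia
# Hypothetical stand-in for a closed interval; not the IntervalSets.jl type.
struct MyClosedInterval{T}
    left::T
    right::T
end

# Membership is just the pair of comparisons; x may be of any type
# that can be compared to the endpoints.
Base.in(x, d::MyClosedInterval) = d.left <= x <= d.right

d = MyClosedInterval(1, 2)          # integer endpoints

1.5 in d                            # Float64 against Int endpoints
(3 // 2) in d                       # Rational against Int endpoints
big(π) in MyClosedInterval(0, 4)    # BigFloat against Int endpoints
2.5 in d                            # outside the interval
```

The first three queries succeed even though the endpoints are Ints, and the last one fails only because 2.5 lies outside the interval.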

Personally, I'm also happy with having eltype. Consider this example:

julia> A = [1,2,3]
3-element Vector{Int64}:
 1
 2
 3

julia> eltype(A)
Int64

julia> 2.0 ∈ A
true

What IntervalSets does is just the continuous analogue of that, largely. For the example above, nothing else would make sense. It would be weird to make the eltype be Float64. It would also be bizarre if 2.0 isn't a member, since after all 2.0 == 2.

Of course it is annoying that 1..2 has eltype set to Int by default, because usually one wants it to be Float64. From a user-friendliness perspective I could imagine changing that default, as long as the construction with Ints also remains possible, given that the focus of this package is really on continuous sets. It is safe to guess every user runs into this issue at some point.

In DomainSets the T in Domain{T} is crucial in many ways and I'd prefer to just call it what it is, the eltype :-)

daanhb avatar Aug 18 '22 10:08 daanhb

Perhaps we could have 1..2 turn into 1.0..2.0, because the notation implies the intention of a continuous set, while leaving Interval(1,2) as it is?

daanhb avatar Aug 18 '22 10:08 daanhb

> Perhaps we could have 1..2 turn into 1.0..2.0, because the notation implies the intention of a continuous set, while leaving Interval(1,2) as it is?

We currently support Date type, so converting Int to Float64 seems inconsistent.

julia> using Dates, IntervalSets

julia> Date(2022,08,12) .. Date(2022,09,12)
2022-08-12..2022-09-12

julia> Date(2022,09,10) in ans
true

hyrodium avatar Aug 18 '22 14:08 hyrodium

> We currently support Date type, so converting Int to Float64 seems inconsistent.

One possible rule would be: if the bounds are <: Real, call float on them.

zsunberg avatar Aug 18 '22 15:08 zsunberg

How about Rational? I think this should not be converted with float.

hyrodium avatar Aug 18 '22 15:08 hyrodium

> > Perhaps we could have 1..2 turn into 1.0..2.0, because the notation implies the intention of a continuous set, while leaving Interval(1,2) as it is?
>
> We currently support Date type, so converting Int to Float64 seems inconsistent.
>
> julia> using Dates, IntervalSets
>
> julia> Date(2022,08,12) .. Date(2022,09,12)
> 2022-08-12..2022-09-12
>
> julia> Date(2022,09,10) in ans
> true

Thanks, that's a cool example.

daanhb avatar Aug 18 '22 15:08 daanhb

> I'm not sure we need eltype function for non-iterable objects such as ClosedInterval.

I think this is the core issue. eltype is part of the iteration interface and continuous sets don't have iteration. Perhaps we need a new function like element_type for continuous sets. Or maybe people should just use Random.gentype.

zsunberg avatar Aug 18 '22 15:08 zsunberg

> > I'm not sure we need eltype function for non-iterable objects such as ClosedInterval.
>
> I think this is the core issue. eltype is part of the iteration interface. Perhaps we need a new function like element_type for continuous sets. Or maybe people should just use Random.gentype.

I somewhat agree, but there is a dual issue to this one in which people discuss why element_type is not just called eltype :-) In practice, the worst offender really is just 1.5 in 1..2, because it is so counterintuitive to have eltype(1..2) be Int.

daanhb avatar Aug 18 '22 15:08 daanhb

> How about Rational? I think this should not be converted with float.

Agreed. How about if the bounds are <: Integer, call float.

zsunberg avatar Aug 18 '22 15:08 zsunberg

For some perspective, in DomainSets we settled on the following. A Domain{T}, which is the supertype of Interval{T}, has eltype T. The statement x::S in d::Domain{T} can be true if:

  • (i) S and T promote to a type U,
  • (ii) x can be converted to U and
  • (iii) d can be converted to Domain{U}.

That is more or less what happens here, as the inequality a <= x would typically promote a and x to U anyway. That scheme seems to work well. The counterintuitive aspect, with respect to the name eltype, is (iii). That is where an Interval{Int} might behave like an Interval{Float64}. In other circumstances it could also behave like Interval{BigFloat} or even something else. That does not mean it does not have an element type, it just possibly has multiple ones and there will never be a single right choice.
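The three steps can be sketched with Base only. SketchInterval, with_eltype and sketch_in are hypothetical names for illustration; the real Domain{T} machinery in DomainSets is richer than this, and the sketch assumes promote_type yields a type both x and the endpoints can be converted to.

```julia
# Hypothetical minimal domain type; not the DomainSets implementation.
struct SketchInterval{T}
    a::T
    b::T
end

Base.eltype(::Type{SketchInterval{T}}) where {T} = T

# Step (iii): rebuild the domain with element type U.
# The default constructor converts both endpoints to U.
with_eltype(::Type{U}, d::SketchInterval) where {U} = SketchInterval{U}(d.a, d.b)

function sketch_in(x::S, d::SketchInterval{T}) where {S,T}
    U = promote_type(S, T)      # (i)   S and T promote to U
    xU = convert(U, x)          # (ii)  x is converted to U
    dU = with_eltype(U, d)      # (iii) d is converted to a domain of U
    return dU.a <= xU <= dU.b
end

d = SketchInterval(1, 2)        # eltype(d) is Int
sketch_in(1.5, d)               # d behaves like SketchInterval{Float64} here
sketch_in(big(1.5), d)          # and like SketchInterval{BigFloat} here
```

The same SketchInterval{Int} is effectively treated as a Float64 or a BigFloat domain depending on what it is queried with, which is the counterintuitive part of (iii).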

daanhb avatar Aug 18 '22 15:08 daanhb

What about boundary_eltype ?

dlfivefifty avatar Aug 18 '22 16:08 dlfivefifty

> Agreed. How about if the bounds are <: Integer, call float.

We still have counterexamples HalfInteger and N0f8. :grin: We don't have an abstract type for a dense subset of Real, so we don't have a proper way to check whether a given type should be converted to float.

> What about boundary_eltype ?

Or boundstype for short? (https://github.com/JuliaMath/IntervalSets.jl/issues/115#issuecomment-1219278427)

hyrodium avatar Aug 18 '22 16:08 hyrodium

> We still have counterexamples HalfInteger and N0f8

Those are not Integers (unless I am mistaken). Keep in mind that this is only for the syntactic sugar of the .. operator. You could always construct Interval(1,2) manually to avoid any conversion.
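As a rough sketch of the rule being proposed (not something the package does): only Integer bounds are passed through float, and everything else is left alone. SugarInterval and dotdot are hypothetical names standing in for the interval type and the .. sugar.

```julia
using Dates

# Hypothetical interval type for illustration only.
struct SugarInterval{T}
    left::T
    right::T
end

# Hypothetical stand-in for the `..` sugar: Integer bounds become floats,
dotdot(a::Integer, b::Integer) = SugarInterval(float(a), float(b))
# while any other bounds are only promoted to a common type.
dotdot(a, b) = SugarInterval(promote(a, b)...)

dotdot(1, 2)                                     # SugarInterval{Float64}
dotdot(1 // 2, 3 // 2)                           # stays Rational{Int}
dotdot(Date(2022, 8, 12), Date(2022, 9, 12))     # stays Date
SugarInterval(1, 2)                              # explicit constructor keeps Int
```

Under this rule Rational and Date bounds are untouched, and the explicit constructor remains the way to get integer endpoints.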

zsunberg avatar Aug 18 '22 16:08 zsunberg

I don't think that, as a user of IntervalSets, I would ever care for the type of the endpoints of the domain. I want to know what the elements are like - it is a set after all - and the first thing I'd do is abuse boundstype for that purpose :-)

daanhb avatar Aug 18 '22 16:08 daanhb

On second thought, something like boundstype probably does make sense for this package - at least it is well defined. In that case, if an eltype is to remain (it seems entirely unused within IntervalSets itself...) perhaps it is more free to differ from T in cases where it is convenient.

daanhb avatar Aug 18 '22 18:08 daanhb

Note replacing 1..2 with float(1)..float(2) is a very bad idea because it changes the definition of the set:

julia> b = typemax(Int); a = b-1;

julia> a..b
9223372036854775806..9223372036854775807

julia> float(a)..float(b)
9.223372036854776e18..9.223372036854776e18

julia> a in (float(a)..float(b))
false
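The reason is visible with Base alone: Float64 represents integers exactly only up to 2^53, so both endpoints round to the same float. A small check, using nothing beyond the a and b defined above:

```julia
b = typemax(Int); a = b - 1

float(a) == float(b)     # both endpoints round to the same Float64
float(b) == 2.0^63       # and that Float64 is 2^63
maxintfloat(Float64)     # largest range of exactly representable integers: 2^53
a == float(a)            # Int and Float64 are compared exactly, so this is false
```

Since float(a) is 2^63, which is larger than a itself, the converted interval no longer contains a.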

ApproxFun has something called prectype which is the only place eltype is used:

https://github.com/JuliaApproximation/ApproxFunBase.jl/blob/7f435e277c1bd0d8222e6defd51ddbb7e0df3e53/src/Domain.jl#L12

To me the T in Domain{T} is dictating the "precision" of the definition of a set, in the case of the interval the precision of the endpoints. Which ApproxFun then uses to assume the precision of functions on the set.

dlfivefifty avatar Aug 18 '22 19:08 dlfivefifty

(sidenote, DomainSets also has a prectype, might it be the same? I probably did copy the name from ApproxFun. It is defined in general here and for domains here. There is also a numtype with a similar role, the basis numeric type used in T.)

daanhb avatar Aug 18 '22 19:08 daanhb

I think it would make a lot of sense to introduce boundstype, and, if eltype is defined, it should be the type that would naturally be returned if one were to write an iterator. (which we know from @dlfivefifty's comment above would be Float64 :smile: )
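A sketch of what that split could look like. Everything here is hypothetical: BoundsInterval is an illustrative type, boundstype is only the name suggested in this thread, and the Integer rule for eltype is just one possible choice.

```julia
# Hypothetical interval type for illustration only.
struct BoundsInterval{T}
    left::T
    right::T
end

# Hypothetical accessor: the type of the stored endpoints, always well defined.
boundstype(::Type{BoundsInterval{T}}) where {T} = T
boundstype(d::BoundsInterval) = boundstype(typeof(d))

# One possible (assumed) eltype that is free to differ from T:
# integer endpoints report a floating-point element type, others report T.
Base.eltype(::Type{BoundsInterval{T}}) where {T} = T
Base.eltype(::Type{BoundsInterval{T}}) where {T<:Integer} = float(T)

d = BoundsInterval(1, 2)
boundstype(d)     # Int
eltype(d)         # Float64
```

The endpoints stay exactly as constructed, so the set itself is unchanged; only the reported element type differs.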

zsunberg avatar Aug 18 '22 23:08 zsunberg

> I want to know what the elements are like - it is a set after all - and the first thing I'd do is abuse boundstype for that purpose :-)

In what practical situations is the interval "eltype" used?

> This has caused confusion in JuliaReinforcementLearning/CommonRLSpaces.jl#12.

This example is just around tests, and it can be replaced with

@test eltype(tp) <: Tuple{Real, Real}

hyrodium avatar Aug 18 '22 23:08 hyrodium

> In what practical situations is the interval "eltype" used?

Whenever an algorithm needs to deal with elements of some set but is only provided with the set itself, e.g.

function f(set)
    d = Dict{eltype(set), Float64}()
    # fill up d
end

A concrete example is in reinforcement learning: a user might specify the action space as -1..1, and then the algorithm needs to deal with actions. For instance, here, I infer the action type based on the action set: https://github.com/JuliaPOMDP/QuickPOMDPs.jl/blob/master/src/quick.jl#L166 (sorry, that code is pretty hard to read out of context with _call)

It is true that most of the time you can avoid needing eltype, but it sometimes requires a lot more thinking.

zsunberg avatar Aug 19 '22 00:08 zsunberg

> (which we know from @dlfivefifty's https://github.com/JuliaMath/IntervalSets.jl/issues/115#issuecomment-1219296524 would be Float64 :smile: )

Oh shoot, I misinterpreted the comment above, so it wouldn't necessarily be Float64. But I think Float64 would be better than Int, because in my use case above, every element of 1..2 can be converted to a Float64, but convert(Int, 1.5) will error.

zsunberg avatar Aug 19 '22 00:08 zsunberg

> Note replacing 1..2 with float(1)..float(2) is a very bad idea because it changes the definition of the set:

This is a fair point. However, through promotion, the package currently does similar conversions anyway:

julia> (1..2) ∪ (1.5..2.5)
1.0..2.5

And Base does it too for ranges between integers, if it is clear that elements will be reals:

julia> range(0, 1, length=100)
0.0:0.010101010101010102:1.0

julia> typeof(ans)
StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}

daanhb avatar Aug 19 '22 05:08 daanhb

> In what practical situations is the interval "eltype" used?

As @dlfivefifty also mentioned, one important goal is to convey the expected numerical accuracy (which leads to prectype). When working with domains in function approximation, I typically want to discretize them. In order to pre-allocate memory for that, I would like to know the type of the grid points. Finally, in more complicated domains of DomainSets, the T also conveys information about the structure (of product domains, for example, where T might be a vector or a tuple).
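To make the pre-allocation point concrete, here is a small Base-only sketch. The helper grid is hypothetical and assumes n >= 2; it picks the grid point type from the endpoint type via float, so Int endpoints give Float64 points and Float32 endpoints stay Float32.

```julia
# Hypothetical helper: n equispaced points on [a, b], with n >= 2 assumed.
function grid(a::T, b::T, n::Integer) where {T<:Real}
    G = float(T)                    # type of the grid points
    pts = Vector{G}(undef, n)       # pre-allocate with the known element type
    for i in 1:n
        pts[i] = a + (b - a) * G(i - 1) / G(n - 1)
    end
    return pts
end

grid(0, 1, 5)          # Vector{Float64}
grid(0f0, 1f0, 5)      # Vector{Float32}
grid(big(0.0), big(1.0), 5)   # Vector{BigFloat}
```

The element type of the domain is what lets the storage be allocated before any point is computed.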

To answer the original question of how this issue affects DomainSets: I think it doesn't. We just define T to be the eltype of Domain{T} anyway, which would cover intervals as a special case even if it is removed here. Based on the arguments here I am not currently inclined to change it (though I'm always up for a nice conceptual debate of course).

I think a counter question might be in which kind of situations it matters that 1..2 has integer endpoints. If I wanted an interval of integers, I would use 1:2.

daanhb avatar Aug 19 '22 06:08 daanhb

> But I think Float64 would be better than Int, because in my use case above, every element of 1..2 can be converted to a Float64

But in my example above b = typemax(Int); a = b - 1; a..b converting to Float64 gives only a single element...

dlfivefifty avatar Aug 19 '22 07:08 dlfivefifty

> But in my example above b = typemax(Int); a = b - 1; a..b converting to Float64 gives only a single element...

Well, if you construct a set like that, maybe you deserve to get only one element :stuck_out_tongue_winking_eye: ! Just kidding.

zsunberg avatar Aug 19 '22 18:08 zsunberg

I know you are just kidding (and so am I!) but it is important to be consistent: arbitrarily converting to floats will just add unnecessary confusion because at some point someone will want an interval containing large ints.

I'm wondering if you would be better off just calling float(eltype(d)) in the offending code that triggered the issue?
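Something along these lines, as a sketch only. ActionInterval and value_table are hypothetical names, with ActionInterval standing in for an interval whose eltype is its endpoint type; note that float on a type only works for numeric types, so this would not apply to a Date interval.

```julia
# Hypothetical interval type whose eltype is the endpoint type.
struct ActionInterval{T}
    left::T
    right::T
end
Base.eltype(::Type{ActionInterval{T}}) where {T} = T

# Hypothetical caller: widen the key type at the point of use
# instead of changing what the interval itself stores or reports.
function value_table(set)
    K = float(eltype(set))
    return Dict{K, Float64}()
end

actions = ActionInterval(-1, 1)     # eltype is Int
table = value_table(actions)        # keys are Float64
table[0.5] = 1.0                    # a non-integer action can be stored
```

The interval keeps its integer endpoints, and only the code that needs floating-point elements asks for them.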

dlfivefifty avatar Aug 19 '22 18:08 dlfivefifty