IntervalSets.jl
Should eltype(1..2) be Float64? (currently it is Int)
It seems quite strange to me that eltype(1..2) is Int because the elements in the mathematical set [1,2] are real numbers, and real numbers are usually represented by Float64.
Happy to make a PR if people support this.
This has caused confusion in JuliaReinforcementLearning/CommonRLSpaces.jl#12.
Why Float64?
julia> big(π) in 0..4
true
I think eltype is just a poorly defined concept here....
I'm not sure we need eltype function for non-iterable objects such as ClosedInterval.
The most consistent way would be to rename eltype(::Interval) to boundstype(::Interval), I guess.
I once had a crazy idea of implementing iterate using nextfloat. This would make sense of eltype (so iterate(0..1) would be equivalent to [0,1]). Unfortunately there are a lot of floats so it was a completely useless idea!
So I think removing eltype would be wise. But this also affects DomainSets.jl. @daanhb thoughts?
I think the ship has sailed, up to some details. The choice in this package is that an interval is defined by the membership function a <= x <= b, regardless of what the types of the endpoints and of x are. This is the most sensible mathematical definition of the continuous interval. In developing DomainSets, so far anything that would lead to restrictions on types over mathematical flexibility or correctness has turned out to be a bad idea. So, first of all, I'm happy with an interval defined by integers containing real numbers (and anything else that can be compared to an integer).
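To make the membership-function view concrete, here is a minimal sketch in plain Julia. `SketchInterval` is a hypothetical stand-in invented for illustration, not the type defined in IntervalSets.jl; it only assumes that the endpoints and the candidate element can be compared with `<=`.

```julia
# Hypothetical stand-in for a closed interval; NOT the IntervalSets.jl type.
struct SketchInterval{T}
    left::T
    right::T
end

# Membership is just the comparison a <= x <= b, whatever the type of x.
Base.in(x, d::SketchInterval) = d.left <= x <= d.right

d = SketchInterval(1, 2)           # integer endpoints

1.5 in d                           # true: a Float64 inside an Int-bounded interval
3 in d                             # false
big(pi) in SketchInterval(0, 4)    # true: BigFloat compared against Int endpoints
```

Nothing in this sketch restricts the type of `x`, which is the point being made: the endpoint type alone does not determine which values are members.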
Personally, I'm also happy with having eltype. Consider this example:
julia> A = [1,2,3]
3-element Vector{Int64}:
1
2
3
julia> eltype(A)
Int64
julia> 2.0 ∈ A
true
What IntervalSets does is just the continuous analogue of that, largely. For the example above, nothing else would make sense. It would be weird to make the eltype be Float64. It would also be bizarre if 2.0 isn't a member, since after all 2.0 == 2.
Of course it is annoying that 1..2 has eltype set to Int by default, because usually one wants it to be Float64. From a user-friendliness perspective I could imagine changing that default, as long as the construction with Ints also remains possible, given that the focus of this package is really on continuous sets. It is safe to guess every user runs into this issue at some point.
In DomainSets the T in Domain{T} is crucial in many ways and I'd prefer to just call it what it is, the eltype :-)
Perhaps we could have 1..2 turn into 1.0..2.0, because the notation implies the intention of a continuous set, while leaving Interval(1,2) as it is?
Perhaps we could have 1..2 turn into 1.0..2.0, because the notation implies the intention of a continuous set, while leaving Interval(1,2) as it is?
We currently support Date type, so converting Int to Float64 seems inconsistent.
julia> using Dates, IntervalSets
julia> Date(2022,08,12) .. Date(2022,09,12)
2022-08-12..2022-09-12
julia> Date(2022,09,10) in ans
true
We currently support Date type, so converting Int to Float64 seems inconsistent.
One possible rule would be: if the bounds are <: Real, call float on them.
How about Rational? I think this should not be converted with float.
Perhaps we could have 1..2 turn into 1.0..2.0, because the notation implies the intention of a continuous set, while leaving Interval(1,2) as it is?
We currently support Date type, so converting Int to Float64 seems inconsistent.
julia> using Dates, IntervalSets
julia> Date(2022,08,12) .. Date(2022,09,12)
2022-08-12..2022-09-12
julia> Date(2022,09,10) in ans
true
Thanks, that's a cool example.
I'm not sure we need eltype function for non-iterable objects such as ClosedInterval.
I think this is the core issue. eltype is part of the iteration interface and continuous sets don't have iteration. Perhaps we need a new function like element_type for continuous sets. Or maybe people should just use Random.gentype.
I'm not sure we need eltype function for non-iterable objects such as ClosedInterval.
I think this is the core issue. eltype is part of the iteration interface. Perhaps we need a new function like element_type for continuous sets. Or maybe people should just use Random.gentype.
I somewhat agree, but there is a dual issue to this one in which people discuss why element_type is not just called eltype :-)
In practice, the worst offender really is just 1.5 in 1..2, because it is so counterintuitive to have eltype(1..2) be Int.
How about Rational? I think this should not be converted with float.
Agreed. How about if the bounds are <: Integer, call float.
For some perspective, in DomainSets we settled on the following. A Domain{T}, which is the supertype of Interval{T}, has eltype T. The statement x::S in d::Domain{T} can be true if:
- (i) S and T promote to a type U,
- (ii) x can be converted to U and
- (iii) d can be converted to Domain{U}.
That is more or less what happens here, as the inequality a <= x would typically promote a and x to U anyway. That scheme seems to work well. The counterintuitive aspect, with respect to the name eltype, is (iii). That is where an Interval{Int} might behave like an Interval{Float64}. In other circumstances it could also behave like Interval{BigFloat} or even something else. That does not mean it does not have an element type, it just possibly has multiple ones and there will never be a single right choice.
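The three conditions can be sketched in plain Julia as follows. This is only an illustration of the scheme as described in this comment, not the DomainSets implementation; `SketchDomain`, `sketch_retype`, and `sketch_in` are hypothetical names.

```julia
# Hypothetical stand-in for Domain{T}; NOT the DomainSets.jl type.
struct SketchDomain{T}
    a::T
    b::T
end

# (iii) convert the domain to element type U by converting its endpoints
sketch_retype(::Type{U}, d::SketchDomain) where {U} =
    SketchDomain{U}(convert(U, d.a), convert(U, d.b))

function sketch_in(x::S, d::SketchDomain{T}) where {S,T}
    U = promote_type(S, T)        # (i)   S and T promote to a common type U
    xu = convert(U, x)            # (ii)  x is converted to U
    du = sketch_retype(U, d)      # (iii) d is converted to SketchDomain{U}
    return du.a <= xu <= du.b
end

sketch_in(1.5, SketchDomain(1, 2))       # true; here U is Float64
sketch_in(big(pi), SketchDomain(0, 4))   # true; here U is BigFloat
sketch_in(2.5, SketchDomain(1, 2))       # false
```

The same `SketchDomain{Int}` behaves like a `Float64` domain in the first call and like a `BigFloat` domain in the second, which is the counterintuitive aspect of (iii).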
What about boundary_eltype ?
Agreed. How about if the bounds are <: Integer, call float.
We still have counterexamples HalfInteger and N0f8. :grin:
We don't have an abstract type for a dense subset of Real, so we don't have a proper way to check whether a given type should be converted to float.
What about boundary_eltype ?
Or boundstype for short? (https://github.com/JuliaMath/IntervalSets.jl/issues/115#issuecomment-1219278427)
We still have counterexamples HalfInteger and N0f8
Those are not Integers (unless I am mistaken). Keep in mind that this is only for the syntactic sugar of the .. operator. You could always construct Interval(1,2) manually to avoid any conversion.
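For reference, the proposed rule (call float only on bounds that are <: Integer) could look roughly like this. `sketch_bound` is a hypothetical helper written for this illustration; nothing like it is claimed to exist in IntervalSets.jl.

```julia
using Dates

# Hypothetical helper illustrating the proposed rule; NOT part of IntervalSets.jl.
sketch_bound(x::Integer) = float(x)   # Int -> Float64
sketch_bound(x) = x                   # Rational, Date, etc. are left untouched

sketch_bound(1)                       # 1.0
sketch_bound(1//2)                    # 1//2
sketch_bound(Date(2022, 8, 12))       # 2022-08-12
```

Under this sketch only the .. sugar would call `sketch_bound`; an explicit constructor call would keep the bounds exactly as given.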
I don't think that, as a user of IntervalSets, I would ever care for the type of the endpoints of the domain. I want to know what the elements are like - it is a set after all - and the first thing I'd do is abuse boundstype for that purpose :-)
On second thought, something like boundstype probably does make sense for this package - at least it is well defined. In that case, if an eltype is to remain (it seems entirely unused within IntervalSets itself...) perhaps it is more free to differ from T in cases where it is convenient.
Note replacing 1..2 with float(1)..float(2) is a very bad idea because it changes the definition of the set:
julia> b = typemax(Int); a = b-1;
julia> a..b
9223372036854775806..9223372036854775807
julia> float(a)..float(b)
9.223372036854776e18..9.223372036854776e18
julia> a in (float(a)..float(b))
false
ApproxFun has something called prectype, which is the only place eltype is used:
https://github.com/JuliaApproximation/ApproxFunBase.jl/blob/7f435e277c1bd0d8222e6defd51ddbb7e0df3e53/src/Domain.jl#L12
To me the T in Domain{T} is dictating the "precision" of the definition of a set, in the case of the interval the precision of the endpoints. Which ApproxFun then uses to assume the precision of functions on the set.
(sidenote, DomainSets also has a prectype, might it be the same? I probably did copy the name from ApproxFun. It is defined in general here and for domains here. There is also a numtype with a similar role, the basis numeric type used in T.)
I think it would make a lot of sense to introduce boundstype, and, if eltype is defined, it should be the type that would naturally be returned if one were to write an iterator. (which we know from @dlfivefifty's comment above would be Float64 :smile: )
I want to know what the elements are like - it is a set after all - and the first thing I'd do is abuse boundstype for that purpose :-)
In what practical situations is the interval "eltype" used?
This has caused confusion in JuliaReinforcementLearning/CommonRLSpaces.jl#12.
This example is just around tests, and it can be replaced with
@test eltype(tp) <: Tuple{Real, Real}
In what practical situations is the interval "eltype" used?
Whenever an algorithm needs to deal with elements of some set but is only provided with the set itself, e.g.
function f(set)
    d = Dict{eltype(set), Float64}()
    # fill up d
end
A concrete example is in reinforcement learning: a user might specify the action space as -1..1, and then the algorithm needs to deal with actions. For instance, here, I infer the action type based on the action set: https://github.com/JuliaPOMDP/QuickPOMDPs.jl/blob/master/src/quick.jl#L166 (sorry, that code is pretty hard to read out of context with _call)
It is true that most of the time you can avoid needing eltype, but it sometimes requires a lot more thinking.
(which we know from @dlfivefifty's comment above (https://github.com/JuliaMath/IntervalSets.jl/issues/115#issuecomment-1219296524) would be Float64 :smile: )
Oh shoot, I misinterpreted the comment above, so it wouldn't necessarily be Float64. But I think Float64 would be better than Int, because in my use case above, every element of 1..2 can be converted to a Float64, but convert(Int, 1.5) will error.
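The asymmetry between the two conversions can be checked with Base alone. This is a small sketch and does not involve IntervalSets.

```julia
# Converting an integer endpoint to Float64 succeeds.
convert(Float64, 1)           # 1.0

# Converting a non-integer element to Int throws.
err = try
    convert(Int, 1.5)
catch e
    e
end

err isa InexactError          # true
```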
Note replacing 1..2 with float(1)..float(2) is a very bad idea because it changes the definition of the set:
This is a fair point. However, through promotion, the package currently does similar conversions anyway:
julia> (1..2) ∪ (1.5..2.5)
1.0..2.5
And Base does it too for ranges between integers, if it is clear that elements will be reals:
julia> range(0, 1, length=100)
0.0:0.010101010101010102:1.0
julia> typeof(ans)
StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}
In what practical situations is the interval "eltype" used?
As @dlfivefifty also mentioned, one important goal is to convey the expected numerical accuracy (which leads to prectype). When working with domains in function approximation, I typically want to discretize them. In order to pre-allocate memory for that, I would like to know the type of the grid points. Finally, in more complicated domains of DomainSets, the T also conveys information about the structure (of product domains, for example, where T might be a vector or a tuple).
To answer the original question of how this issue affects DomainSets: I think it doesn't. We just define T to be the eltype of Domain{T} anyway, which would cover intervals as a special case even if it is removed here. Based on the arguments here I am not currently inclined to change it (though I'm always up for a nice conceptual debate of course).
I think a counter question might be in which kind of situations it matters that 1..2 has integer endpoints. If I wanted an interval of integers, I would use 1:2.
But I think Float64 would be better than Int, because in my use case above, every element of 1..2 can be converted to a Float64
But in my example above b = typemax(Int); a = b - 1; a..b converting to Float64 gives only a single element...
But in my example above b = typemax(Int); a = b - 1; a..b converting to Float64 gives only a single element...
Well, if you construct a set like that, maybe you deserve to get only one element :stuck_out_tongue_winking_eye: ! Just kidding.
I know you are just kidding (and so am I!) but it is important to be consistent: arbitrarily converting to floats will just add unnecessary confusion because at some point someone will want an interval containing large ints.
I'm wondering if you would be better off just calling float(eltype(d)) in the offending code that triggered the issue?
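A minimal sketch of that suggestion, using only Base so that it does not depend on how a given IntervalSets version defines eltype. `sketch_float_eltype` is a hypothetical helper, and an integer range stands in for the set.

```julia
# Hypothetical helper: derive a floating-point element type from whatever
# type the set reports, without modifying the set itself.
sketch_float_eltype(set) = float(eltype(set))

T = sketch_float_eltype(1:2)      # Float64, since eltype(1:2) is Int

# Use it where the calling code needs a container for elements of the set.
d = Dict{T, Float64}()
d[1.5] = 0.0                      # a non-integer key is accepted
```

This keeps the endpoints of the set exact and moves the decision to use floats into the calling code, which is where the element type is actually needed.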