IntervalSets.jl icon indicating copy to clipboard operation
IntervalSets.jl copied to clipboard

Let the element type and the endpoint types be different

Open zsunberg opened this issue 3 years ago • 10 comments

The rough conclusion that we reached in #117 is that the endpoint types should not necessarily be the same as the eltype.

I decided to implement this in a PR that I will submit shortly, but am filing this issue for conceptual discussion. The key lines that summarize the new implementation are:

struct Interval{L,R,T,TL,TR}  <: TypedEndpointsInterval{L,R,T}
    left::TL
    right::TR
end

There are some important choices I made with my implementation:

  1. I pushed the endpoint types as far down in the type hierarchy as possible, i.e. only the concrete type Interval has endpoint types. There are no abstract types with endpoint type parameters. The type of the endpoint can be accessed with e.g. typeof(leftendpoint(i)). This seemed to be the simplest and lowest-risk option, and we can change it later if necessary.
  2. One choice is what the default eltype should be if the user doesn't specify it explicitly. I originally tried to keep the eltype as close to the endpoint types except for Integers (e.g. eltype(1..3//2) would be Rational, eltype(1..2) would be Float64), but this ended up feeling too special-case-y. I decided a better option would be to call float to determine the eltype from all endpoint types <:Number in the Interval constructor (i.e. eltype(1..3//2) is Float64). This ended up feeling very nice and consistent and works well with other packages like Unitful.
  3. I also added a docstring defining the meaning of eltype for a Domain. It is "the type that best represents elements of the domain according to the criteria chosen by the programmer who created the domain." I think it is best to give users flexibility since it is hard to predict everything Domains will be used for.

What does everyone think?

zsunberg avatar Sep 28 '22 01:09 zsunberg

Implementation is in #122

zsunberg avatar Sep 28 '22 01:09 zsunberg

I still think the eltype(::Interval) method should be removed. Because the use of Base.eltype is intended for iterable objects (see documentation), and the method eltype(::Interval) can be considered as "Type III piracy - Extending with your own type, but for a different purpose". See Hands-On Design Patterns and Best Practices with Julia for more information.

Instead of adding the method to eltype, we can define a new function eltype_interval(::Interval). But in my understanding, that does not require the additional type parameter to Interval type.

julia> eltype_interval(::TypedEndpointsInterval{L,R,T}) where {L,R,T} = T
eltype_interval (generic function with 1 method)

julia> eltype_interval(::TypedEndpointsInterval{L,R,T}) where {L,R,T<:Number} = float(T)
eltype_interval (generic function with 2 methods)

julia> eltype_interval(1..2)
Float64

julia> eltype_interval(1..big(2))
BigFloat

julia> eltype_interval(1..1//2)
Float64

hyrodium avatar Sep 28 '22 08:09 hyrodium

I decided a better option would be to call float on all endpoint types <:Number in the Interval constructor (i.e. eltype(1..3//2) is Float64). This ended up feeling very nice and consistent and works well with other packages like Unitful.

But... Why? The user writing 1..3//2 and not 1..3/2 looks like a very explicit request for the bounds to be exact rationals, not floats. Float conversion would also break current correct behavior in

julia> 10^16+1 ∈ (10^16+1)..10^17
true

julia> 10^16 ∈ (10^16+1)..10^17
false

Maybe, just promoting two endpoints to the same type is best, as done basically everywhere in julia?

aplavin avatar Sep 28 '22 09:09 aplavin

I still think the eltype(::Interval) method should be removed. Because the use of Base.eltype is intended for iterable objects (see documentation), and the method eltype(::Interval) can be considered as "Type III piracy - Extending with your own type, but for a different purpose".

Thanks for bringing this up! It is very important to be wary of. I think that it is a judgement call because if these sets were iterable, they would certainly return objects of type eltype. So, even if it does not strictly match the documented meaning, it is still not inconsistent with the documented meaning. I come down strongly on the side of using eltype since it will allow people (me!) to write generic code to work with many types of sets including intervals.

If we do not want to use eltype, I suggest we have element_type since we will want to have a function that works on all sets.

zsunberg avatar Sep 28 '22 19:09 zsunberg

But... Why? The user writing 1..3//2 and not 1..3/2 looks like a very explicit request for the bounds to be exact rationals, not floats.

Yes! Absolutely! The bounds are Rational, but the eltype (which does not have to be the same as the bounds) is Float64. Users can have a Rational eltype if they want with ClosedInterval{Rational{Int}}(1, 3//2).

zsunberg avatar Sep 28 '22 20:09 zsunberg

Float conversion would also break current correct behavior in

julia> 10^16+1 ∈ (10^16+1)..10^17
true

julia> 10^16 ∈ (10^16+1)..10^17
false

Maybe, just promoting two endpoints to the same type is best, as done basically everywhere in julia?

I think there may be some confusion about what I am proposing. I have added a code snippet to the original post above to clarify. I want to give the user maximum flexibility about what the endpoint types are. The behavior above would be preserved!

zsunberg avatar Sep 28 '22 20:09 zsunberg

@hyrodium regarding the type piracy issue, I do not think using eltype is that much different than using in. From the docstring for in:

Determine whether an item is in the given collection, in the sense that it is == to one of the values generated by iterating over the collection.

So the documented definition of in also involves iteration and ==. Our definition of in is thus different, but everyone thinks we should use in. I also think we should use eltype.

zsunberg avatar Sep 28 '22 20:09 zsunberg

@zsunberg Sorry, I overlooked this comment.

regarding the type piracy issue, I do not think using eltype is that much different than using in. From the docstring for in:

I understand your point, but there are big differences between in and eltype:

  • We do not have a proper definition for eltype. (https://github.com/JuliaMath/IntervalSets.jl/issues/115#issuecomment-1219229766)
    • But in(, ::AbstractInterval) is well-defined.
  • We do not have a practical use case that really needs eltype.
    • In this case, please just define a wrapper function in your package. I believe IntervalSets.jl is not responsible for such use cases.

I am planning to replace eltype with boundstype in IntervalSets.jl. If you have any thoughts, please comment in #150.

hyrodium avatar Jul 30 '23 00:07 hyrodium

@hyrodium thanks for responding! Though it is not exactly what I would have chosen (#122 is my suggestion), I think replacing eltype with boundstype is a reasonable solution.

zsunberg avatar Jul 31 '23 17:07 zsunberg

How about having something like

struct EndPoint{LR, OC, T}
    x::T
end

const Interval{L,R,T} = Pair{EndPoint{:L,L,T}, EndPoint{:R,R,T}}

where LR ∈ (:L, :R), OC ∈ (:open, :closed) and T is eltype. All sorts of interval interfaces can base on endpoint interfaces.

On a second thought, EndPoint can effectively serve as one-sided unbounded interval, and a bounded interval is a lazy intersection of two unbounded intervals.

putianyi889 avatar Sep 13 '23 17:09 putianyi889