Use DomainSets.jl?
Our domains.jl is currently kind of hacked together. When I wondered about replacing this with ManifoldsBase, @gdalle reminded me about DomainSets.jl:
Just popping by to mention https://github.com/JuliaApproximation/DomainSets.jl as part of the discussion of
AbstractDomains.Originally posted by @gdalle in https://github.com/cscherrer/MeasureBase.jl/issues/7#issuecomment-906761492
Funnily enough, we had some brief discussion about this three years ago here.
There are a few places domains come up, though we don't currently have a uniform interface for managing them
- For determining a bijection to a real vector space
- To anticipate error conditions in
logdensityand replace these cases with-Inf - With
⊆, for≪computations (nested supports are necessary for absolute continuity) - With
∩for supports of pointwise product (⊙) - With
∪for superposition (+)
Some potential concerns:
- Can this play nice with ManifoldMeasures? (cc @sethaxen)
- DomainSets currently depends on
StaticArrays, which is a heavy dependency for a core library like this. Would it be possible to refactor DomainSets somewhat to use Static.jl instead, possibly with an optional StaticArrays dependency? Or maybe there's a better approach? (This part should probably be a new issue in DomainsSets, cc @daanhb @dlfivefifty)
:point_up: :confused: Well, that didn't work
AFAIC, the main benefit of using domains for me was being able to integrate a measure over a set. This can be useful to normalize it, or to simulate Poisson processes for instance.
Just pointing out DomainIntegrals.jl here, a package for (numerically) evaluating integrals over domains, possibly with respect to a measure. The included measures are heavily biased towards approximation theory though, e.g. weight functions for orthogonal polynomials, and not very well developed. DomainSets itself also has some of that bias, to be honest.
Thanks @daanhb , DomainIntegrals looks really nice!
Off hand, challenges to address are
- Keep MeasureBase dependencies relatively small
- But inclusive enough to give a consistent feel to the ecosystem
- Play nicely with packages like Manifolds.jl
We'll also need to get to things like measures on function spaces, so other packages from JuliaApproximation could well come into play.
From what I'm seeing so far, it seems like it could make sense to move toward having DomainSets as a dependency for MeasureBase. The fact that it's easily extensible should give us lots of flexibility. The main interest is not so much for its included measures (though that could be interesting as well) but as a way to describe supports of measures we build.
We could consider adding DomainIntegrals as a dependency for MeasureTheory (not MeasureBase), though we should first look at advantages of this over just keeping it separate. I mean, if we start with DomainSets, it might be easy enough for DomainIntegrals to "just work" without the need to have it as a dependency. Then packages like @gdalle is building could still take advantage of it.
That sounds good. Small comment: measures are only defined in DomainIntegrals.jl, not in DomainSets. At some point they'll probably move to a package that is yet-to-be-named, or even better, be replaced with an existing package.
Simple proof of concept here: https://github.com/cscherrer/MeasureBase.jl/pull/19
There's really not much to it, more interesting will be to see how this change would affect MeasureTheory.jl
This is really helpful (thanks @daanhb ): https://github.com/JuliaApproximation/DomainSets.jl/issues/91#issuecomment-934146443
So one possibility would be for a Measure to be parameterized by a Domain. It would be really nice to not have to worry about domains, and to have a clean connection to the JuliaApproximation tools. My biggest concern at this point is with how this could affect @sethaxen's work on https://github.com/JuliaManifolds/ManifoldMeasures.jl
Daan or Seth (or others), what do you think?
So one possibility would be for a Measure to be parameterized by a Domain. It would be really nice to not have to worry about domains, and to have a clean connection to the JuliaApproximation tools. My biggest concern at this point is with how this could affect @sethaxen's work on https://github.com/JuliaManifolds/ManifoldMeasures.jl
I haven't really been following this discussion, but this is similar to the solution I have been using for ManifoldMeasures.jl. There, each measure is associated with a manifold. Because some measures can be written generically for multiple manifolds (e.g. VonMisesFisher), many of the measures actually store a manifold object, but this is not currently part of the API; instead, we overload the Manifolds.base_manifold method from Manifolds to retrieve the manifold.
If one wanted to do the same thing with MeasureTheory but using Domains, this would be fine unless core functionality of MeasureTheory unnecessarily relied on a domain being defined as as a subtype of some type in DomainSets. An option is to now have a type constraint, which could allow swapping a Manifold as a "domain".
If one wanted to do the same thing with MeasureTheory but using Domains, this would be fine unless core functionality of MeasureTheory unnecessarily relied on a domain being defined as as a subtype of some type in DomainSets. An option is to now have a type constraint, which could allow swapping a
Manifoldas a "domain".
If I'm understanding correctly, this should be ok as long as every Manifold has in andeltype methods. Here's from the bottom of the DomainSets README:
The domain interface
A domain is any type that implements the functions
eltypeandin. Ifdis an instance of a type that implements the domain interface, then the domain consists of allxthat is aneltype(d)such thatx in dreturns true.Domains often represent continuous mathematical domains, for example, a domain
drepresenting the interval[0,1]would haveeltype(d) == Intbut still have0.2 in dreturn true.The
DomaintypeDomainSets.jl contains an abstract type
Domain{T}. All subtypes ofDomain{T}must implement the domain interface, and in addition supportconvert(Domain{T}, d).
DomainSets seems to have very nice handling of unions and products. So I think we can have some simple laws like
the domain of a ProductMeasure is the ProductDomain of its components
and
the domain of a SuperpositionMeasure is the union of its components.
Other interesting things:
- DomainIntegrals includes some measures, so hopefully we can interface cleanly with this
- Lots of the JuliaApproximation packages deal with infinite arrays and function spaces. I'm not sure yet about how these connect to DomainSets, but we'll definitely want measures on spaces like these
Just to confirm, we like to avoid requiring domains be a subtype of Domain. This is not always convenient or possible, but it does lead to flexibility where it works. Even a vector is a domain, because it defines membership. I'd support a design in which the Domain type is not explicitly assumed, only something that behaves like a domain. I did not yet look at the manifolds being mentioned here.
If I'm understanding correctly, this should be ok as long as every
Manifoldhasinandeltypemethods. Here's from the bottom of the DomainSets README:
While all Manifolds have an in method, they do not have an eltype method by design, since there are many potential representations for points on a manifold, including different array types one might want to use as well as different coordinate systems. We of course have a documented "default representation," but even that is never type-constrained, it's simply assumed to be the argument unless one adds a custom method for a different representation.
In ManifoldMeasures, for sampling, we have had to make this more explicit by defining a default_point, but that is just to generate a point in rand that is overwritten by rand!. One can pass a different point type to rand! if they like.
Probably worth noting again that Distributions.jl has very weird eltype behavior:
julia> using Distributions
julia> eltype(Normal())
Float64
julia> eltype(MvNormal(ones(3)))
Float64
This doesn't compose well, but it's probably much to late to fix. I had thought this meant we should avoid eltype (this led us to sampletype), but maybe we should have it after all? It seems easy enough to have it be abstract, for example
julia> eltype(AbstractArray{AbstractVector{Real}})
AbstractVector{Real} (alias for AbstractArray{Real, 1})
Also, there's a default method:
julia> abstract type T end
julia> eltype(T)
Any
So in the DomainSets requirement,
the domain consists of all
xthat is aneltype(d)such thatx in dreturnstrue.
I guess this just becomes a universal quantifier?
In our uses of DomainSets for function approximation it is useful to have a T in Domain{T}. The T indicates the expected precision with which to do computations, if it is Float64 or an array of Float64 or something else indicative of a floating point precision. That's why we have it. But it becomes impractical to enforce an eltype too much. We do have eltype in the definition of the interface of a domain, but it becomes very nuanced to talk about what the T in Domain{T} actually means. Perhaps the documentation is even not entirely correct. The issue is (encouragingly) reminiscent of what @sethaxen writes above.
The simplest examples are intervals. The closed interval [a,b] contains all x for which a <= x <= b, regardless of what the types are. To the extent that it is feasible in practice, domains are oblivious to what T is and behave like their mathematical definition:
julia> eltype(Interval(0,1))
Int64
julia> eltype(Interval(0.0,1.0))
Float64
julia> 0.5 ∈ Interval(0,1)
true
julia> Interval(0,1) == Interval(0.0,1.0)
true
It is a bit like the standard Julia behaviour of membership for arrays:
julia> 2 ∈ [1.0, 2.0, 3.0]
true
julia> 2.0 ∈ [1, 2, 3]
true
julia> [1,2,3] == [1.0, 2.0, 3.0]
true
I guess this makes the T in Domain{T} a bit like the "default representation". What the domain actually is, is decided by what in returns.
Being convinced over time that this notion of Domain{T} was workable, I also defined a Measure{T} supertype, the idea being that the support of a Measure{T} is a Domain{T} (or something that acts like a Domain{T}) with a similar loose definition of T.
So Manifolds.jl is well developed I see. Also, interoperability by design is fun:
julia> using DomainSets, Manifolds
julia> M1 = Manifolds.Sphere(2)
Sphere(2, ℝ)
julia> M2 = Manifolds.Circle()
Circle(ℝ)
julia> d = uniondomain(M1, M2)
Sphere(2, ℝ) ∪ Circle(ℝ)
julia> 0.5 ∈ d
true
The example doesn't necessarily make sense, but it works :-) (Well, to be honest, it doesn't with the current release of DomainSets, I had to remove a hidden assumption about domains.)
The
Tindicates the expected precision with which to do computations, if it isFloat64or an array ofFloat64or something else indicative of a floating point precision.
You had me worried for a minute, since this sounded a little like the weird eltype behavior in Distributions. But your eltype seems much better behaved:
julia> d1 = ProductDomain(map(Interval, rand(10), rand(10)))
D₁ × D₃ × D₂ × ... × D₄
D₁ = 0.04102775113924917..0.7543070839059667
D₂ = 0.0301860782231258..0.3334968805199827
D₃ = 0.1203453376504664..0.31294581525887777
D₄ = 0.8943245774798796..0.021646286220368793
julia> typeof(d1)
Rectangle{Vector{Float64}}
julia> eltype(d1)
Vector{Float64} (alias for Array{Float64, 1})
This is a little bit of a tangent, but things like eltype and type parameters have been kind of a headache for us, so I think it will be helpful for me to see how you've addressed these things.
Just reviving this discussion to mention LazySets.jl, which mostly contains lazy implementations of convex polyhedra, but some of its functions work for general sets.
Other related discussions:
- https://github.com/cscherrer/MeasureTheory.jl/issues/128 and the associated PR https://github.com/cscherrer/MeasureTheory.jl/pull/130
- https://github.com/cscherrer/MeasureBase.jl/pull/11