MeasureBase.jl Use DomainSets.jl?

Our domains.jl is currently kind of hacked together. When I wondered about replacing this with ManifoldsBase, @gdalle reminded me about DomainSets.jl:

Just popping by to mention https://github.com/JuliaApproximation/DomainSets.jl as part of the discussion of AbstractDomains.

Originally posted by @gdalle in https://github.com/cscherrer/MeasureBase.jl/issues/7#issuecomment-906761492

Funnily enough, we had some brief discussion about this three years ago here.

There are a few places domains come up, though we don't currently have a uniform interface for managing them

For determining a bijection to a real vector space
To anticipate error conditions in logdensity and replace these cases with -Inf
With ⊆, for ≪ computations (nested supports are necessary for absolute continuity)
With ∩ for supports of pointwise product (⊙)
With ∪ for superposition (+)

Some potential concerns:

Can this play nice with ManifoldMeasures? (cc @sethaxen)
DomainSets currently depends on StaticArrays, which is a heavy dependency for a core library like this. Would it be possible to refactor DomainSets somewhat to use Static.jl instead, possibly with an optional StaticArrays dependency? Or maybe there's a better approach? (This part should probably be a new issue in DomainsSets, cc @daanhb @dlfivefifty)

Sep 06 '21 14:09 cscherrer

:point_up: :confused: Well, that didn't work

Sep 06 '21 14:09 cscherrer

AFAIC, the main benefit of using domains for me was being able to integrate a measure over a set. This can be useful to normalize it, or to simulate Poisson processes for instance.

Sep 06 '21 15:09 gdalle

Just pointing out DomainIntegrals.jl here, a package for (numerically) evaluating integrals over domains, possibly with respect to a measure. The included measures are heavily biased towards approximation theory though, e.g. weight functions for orthogonal polynomials, and not very well developed. DomainSets itself also has some of that bias, to be honest.

Sep 06 '21 15:09 daanhb

Thanks @daanhb , DomainIntegrals looks really nice!

Off hand, challenges to address are

Keep MeasureBase dependencies relatively small
But inclusive enough to give a consistent feel to the ecosystem
Play nicely with packages like Manifolds.jl

We'll also need to get to things like measures on function spaces, so other packages from JuliaApproximation could well come into play.

From what I'm seeing so far, it seems like it could make sense to move toward having DomainSets as a dependency for MeasureBase. The fact that it's easily extensible should give us lots of flexibility. The main interest is not so much for its included measures (though that could be interesting as well) but as a way to describe supports of measures we build.

We could consider adding DomainIntegrals as a dependency for MeasureTheory (not MeasureBase), though we should first look at advantages of this over just keeping it separate. I mean, if we start with DomainSets, it might be easy enough for DomainIntegrals to "just work" without the need to have it as a dependency. Then packages like @gdalle is building could still take advantage of it.

Sep 06 '21 16:09 cscherrer

That sounds good. Small comment: measures are only defined in DomainIntegrals.jl, not in DomainSets. At some point they'll probably move to a package that is yet-to-be-named, or even better, be replaced with an existing package.

Sep 06 '21 18:09 daanhb

Simple proof of concept here: https://github.com/cscherrer/MeasureBase.jl/pull/19

There's really not much to it, more interesting will be to see how this change would affect MeasureTheory.jl

Sep 07 '21 17:09 cscherrer

This is really helpful (thanks @daanhb ): https://github.com/JuliaApproximation/DomainSets.jl/issues/91#issuecomment-934146443

So one possibility would be for a Measure to be parameterized by a Domain. It would be really nice to not have to worry about domains, and to have a clean connection to the JuliaApproximation tools. My biggest concern at this point is with how this could affect @sethaxen's work on https://github.com/JuliaManifolds/ManifoldMeasures.jl

Daan or Seth (or others), what do you think?

Oct 05 '21 15:10 cscherrer

So one possibility would be for a Measure to be parameterized by a Domain. It would be really nice to not have to worry about domains, and to have a clean connection to the JuliaApproximation tools. My biggest concern at this point is with how this could affect @sethaxen's work on https://github.com/JuliaManifolds/ManifoldMeasures.jl

I haven't really been following this discussion, but this is similar to the solution I have been using for ManifoldMeasures.jl. There, each measure is associated with a manifold. Because some measures can be written generically for multiple manifolds (e.g. VonMisesFisher), many of the measures actually store a manifold object, but this is not currently part of the API; instead, we overload the Manifolds.base_manifold method from Manifolds to retrieve the manifold.

If one wanted to do the same thing with MeasureTheory but using Domains, this would be fine unless core functionality of MeasureTheory unnecessarily relied on a domain being defined as as a subtype of some type in DomainSets. An option is to now have a type constraint, which could allow swapping a Manifold as a "domain".

Oct 06 '21 09:10 sethaxen

If one wanted to do the same thing with MeasureTheory but using Domains, this would be fine unless core functionality of MeasureTheory unnecessarily relied on a domain being defined as as a subtype of some type in DomainSets. An option is to now have a type constraint, which could allow swapping a Manifold as a "domain".

If I'm understanding correctly, this should be ok as long as every Manifold has in andeltype methods. Here's from the bottom of the DomainSets README:

The domain interface

A domain is any type that implements the functions eltype and in. If d is an instance of a type that implements the domain interface, then the domain consists of all x that is an eltype(d) such that x in d returns true.

Domains often represent continuous mathematical domains, for example, a domain d representing the interval [0,1] would have eltype(d) == Int but still have 0.2 in d return true.

The Domain type

DomainSets.jl contains an abstract type Domain{T}. All subtypes of Domain{T} must implement the domain interface, and in addition support convert(Domain{T}, d).

DomainSets seems to have very nice handling of unions and products. So I think we can have some simple laws like

the domain of a ProductMeasure is the ProductDomain of its components

and

the domain of a SuperpositionMeasure is the union of its components.

Other interesting things:

DomainIntegrals includes some measures, so hopefully we can interface cleanly with this
Lots of the JuliaApproximation packages deal with infinite arrays and function spaces. I'm not sure yet about how these connect to DomainSets, but we'll definitely want measures on spaces like these

Oct 06 '21 13:10 cscherrer

Just to confirm, we like to avoid requiring domains be a subtype of Domain. This is not always convenient or possible, but it does lead to flexibility where it works. Even a vector is a domain, because it defines membership. I'd support a design in which the Domain type is not explicitly assumed, only something that behaves like a domain. I did not yet look at the manifolds being mentioned here.

Oct 06 '21 13:10 daanhb

If I'm understanding correctly, this should be ok as long as every Manifold has in andeltype methods. Here's from the bottom of the DomainSets README:

While all Manifolds have an in method, they do not have an eltype method by design, since there are many potential representations for points on a manifold, including different array types one might want to use as well as different coordinate systems. We of course have a documented "default representation," but even that is never type-constrained, it's simply assumed to be the argument unless one adds a custom method for a different representation.

In ManifoldMeasures, for sampling, we have had to make this more explicit by defining a default_point, but that is just to generate a point in rand that is overwritten by rand!. One can pass a different point type to rand! if they like.

Oct 06 '21 13:10 sethaxen

Probably worth noting again that Distributions.jl has very weird eltype behavior:

julia> using Distributions

julia> eltype(Normal())
Float64

julia> eltype(MvNormal(ones(3)))
Float64

This doesn't compose well, but it's probably much to late to fix. I had thought this meant we should avoid eltype (this led us to sampletype), but maybe we should have it after all? It seems easy enough to have it be abstract, for example

julia> eltype(AbstractArray{AbstractVector{Real}})
AbstractVector{Real} (alias for AbstractArray{Real, 1})

Also, there's a default method:

julia> abstract type T end

julia> eltype(T)
Any

So in the DomainSets requirement,

the domain consists of all x that is an eltype(d) such that x in d returns true.

I guess this just becomes a universal quantifier?

Oct 06 '21 13:10 cscherrer

In our uses of DomainSets for function approximation it is useful to have a T in Domain{T}. The T indicates the expected precision with which to do computations, if it is Float64 or an array of Float64 or something else indicative of a floating point precision. That's why we have it. But it becomes impractical to enforce an eltype too much. We do have eltype in the definition of the interface of a domain, but it becomes very nuanced to talk about what the T in Domain{T} actually means. Perhaps the documentation is even not entirely correct. The issue is (encouragingly) reminiscent of what @sethaxen writes above.

The simplest examples are intervals. The closed interval [a,b] contains all x for which a <= x <= b, regardless of what the types are. To the extent that it is feasible in practice, domains are oblivious to what T is and behave like their mathematical definition:

julia> eltype(Interval(0,1))
Int64

julia> eltype(Interval(0.0,1.0))
Float64

julia> 0.5 ∈ Interval(0,1)
true

julia> Interval(0,1) == Interval(0.0,1.0)
true

It is a bit like the standard Julia behaviour of membership for arrays:

julia> 2 ∈ [1.0, 2.0, 3.0]
true

julia> 2.0 ∈ [1, 2, 3]
true

julia> [1,2,3] == [1.0, 2.0, 3.0]
true

I guess this makes the T in Domain{T} a bit like the "default representation". What the domain actually is, is decided by what in returns.

Being convinced over time that this notion of Domain{T} was workable, I also defined a Measure{T} supertype, the idea being that the support of a Measure{T} is a Domain{T} (or something that acts like a Domain{T}) with a similar loose definition of T.

Oct 06 '21 14:10 daanhb

So Manifolds.jl is well developed I see. Also, interoperability by design is fun:

julia> using DomainSets, Manifolds

julia> M1 = Manifolds.Sphere(2)
Sphere(2, ℝ)

julia> M2 = Manifolds.Circle()
Circle(ℝ)

julia> d = uniondomain(M1, M2)
Sphere(2, ℝ) ∪ Circle(ℝ)

julia> 0.5 ∈ d
true

The example doesn't necessarily make sense, but it works :-) (Well, to be honest, it doesn't with the current release of DomainSets, I had to remove a hidden assumption about domains.)

Oct 06 '21 15:10 daanhb

The T indicates the expected precision with which to do computations, if it is Float64 or an array of Float64 or something else indicative of a floating point precision.

You had me worried for a minute, since this sounded a little like the weird eltype behavior in Distributions. But your eltype seems much better behaved:

julia> d1 = ProductDomain(map(Interval, rand(10), rand(10)))
D₁ × D₃ × D₂ × ... × D₄

D₁ = 0.04102775113924917..0.7543070839059667
D₂ = 0.0301860782231258..0.3334968805199827
D₃ = 0.1203453376504664..0.31294581525887777
D₄ = 0.8943245774798796..0.021646286220368793

julia> typeof(d1)
Rectangle{Vector{Float64}}

julia> eltype(d1)
Vector{Float64} (alias for Array{Float64, 1})

This is a little bit of a tangent, but things like eltype and type parameters have been kind of a headache for us, so I think it will be helpful for me to see how you've addressed these things.

Oct 06 '21 15:10 cscherrer

Just reviving this discussion to mention LazySets.jl, which mostly contains lazy implementations of convex polyhedra, but some of its functions work for general sets.

Nov 12 '21 07:11 gdalle

Other related discussions:

https://github.com/cscherrer/MeasureTheory.jl/issues/128 and the associated PR https://github.com/cscherrer/MeasureTheory.jl/pull/130
https://github.com/cscherrer/MeasureBase.jl/pull/11

Nov 12 '21 07:11 gdalle

Use DomainSets.jl?

The domain interface

The Domain type

The `Domain` type