CoordinateTransformations.jl icon indicating copy to clipboard operation
CoordinateTransformations.jl copied to clipboard

Traits for coordinate types

Open juliohm opened this issue 2 years ago • 13 comments

Thank you for this package, it is very clean and useful.

I am trying to generalize the types of coordinates supported in Meshes.jl and noticed that CoordinateTransformations.jl and Geodesy.jl already implement a lot of them 💯 I understand that Cartesian coordinates are modeled as SVector whereas other coordinate types have their own type like Polar, LLA, ... In order to support all these types consistently, it would be nice to have a collection of basic traits that indicate the embedding dimension and machine type used to store the coordinates.

To give more context, we currently use Point everywhere in Meshes.jl and this struct holds the Cartesian coordinates as SVector:

https://github.com/JuliaGeometry/Meshes.jl/blob/2569d9cb2b618c24549ccccbbdb25df7efa1bcd3/src/points.jl#L22-L25

We have two traits embeddim(point) and coordtype(point) to retrieve the dimension and machine type inside algorithms because that is important to pre-allocate memory, define sizes of matrices, etc. Is there a contract somewhere with a list of traits that should be implemented for new coordinate types? Where can I contribute?

Also, I think it would make sense to add a new type Cartesian for the Cartesian coordinates case to avoid type piracy with SVector. Please let me know how I can help with this.

cc: @andyferris @c42f

juliohm avatar Aug 16 '21 21:08 juliohm

Also, I think it would make sense to add a new type Cartesian for the Cartesian coordinates case to avoid type piracy with SVector. Please let me know how I can help with this.

I'm not sure if there is any issue of type piracy - a Cartesian representation of an affine point has, as far as I can tell, a subset of the interface of a vector. So you don't need to define anything inconsistent with the AbstractVector iterface. You are always free to define new functions on StaticVector like coordinates, coordtype, embeddim, etc. In previous work we tended to use StaticVector{N, <:Real} as our point type.

The main advantage of having a new affine type is that it would disallow certain operations (adding points or scaling points) and instead transformations would happen through vectors (you could subtract two points to get a vector, and you could add a points and a vector, etc). This might avoid certain accidental errors at the cost of requiring an additional Point wrapper for the end user in various places. Since someone could always define origin = Point(SVector(0, 0, 0)) you can't fundamentally stop people working in the vector space about the origin. If you think the trade-off is worth it then I don't particularly mind either way.

Is there a contract somewhere with a list of traits that should be implemented for new coordinate types? Where can I contribute?

No, not really. CoordinateTransformations itself is very unopinionated - it just lets you define and apply transformations. If you can define a common basis that other things can be built upon (considering both 3D graphics and GIS needs would be a good start) I suppose that could be useful - for points I'm not sure whether there is anything more than the dimensionality and coordinate type to consider generically?

andyferris avatar Aug 18 '21 03:08 andyferris

I'm not sure if there is any issue of type piracy - a Cartesian representation of an affine point has, as far as I can tell, a subset of the interface of a vector. So you don't need to define anything inconsistent with the AbstractVector iterface. You are always free to define new functions on StaticVector like coordinates, coordtype, embeddim, etc. In previous work we tended to use StaticVector{N, <:Real} as our point type.

I am bit confused, can you clarify a bit further? If the Cartesian coordinates are a subset of the StaticVector interface why we need to add new functions like coordtype, embeddim, etc? Did you mean the opposite maybe? The Cartesian type is a more restrictive type of array with a larger set of functions in its API, which contains the API of the StaticVector, correct? Another benefit of avoiding StaticVector for the Cartesian representation is that Julia won't need to recompile modules every time someone loads a 3rd-party module operating on StaticVector. I remember Makie.jl had issues of compilation time because of these kinds of invalidation, but I am not 100% sure.

The main advantage of having a new affine type is that it would disallow certain operations (adding points or scaling points) and instead transformations would happen through vectors (you could subtract two points to get a vector, and you could add a points and a vector, etc). This might avoid certain accidental errors at the cost of requiring an additional Point wrapper for the end user in various places. Since someone could always define origin = Point(SVector(0, 0, 0)) you can't fundamentally stop people working in the vector space about the origin. If you think the trade-off is worth it then I don't particularly mind either way.

That is precisely what we do in Meshes.Point. We do not want points to inherit all the functionality of StaticVector. The same point can even have different coordinates depending on the coordinate reference system (e.g. a point on the sphere with lat-lon can be represented with x-y-z Cartesian coordinates).

No, not really. CoordinateTransformations itself is very unopinionated - it just lets you define and apply transformations. If you can define a common basis that other things can be built upon (considering both 3D graphics and GIS needs would be a good start) I suppose that could be useful - for points I'm not sure whether there is anything more than the dimensionality and coordinate type to consider generically?

Got it. I will try to work on this in the following months. Having a consistent set of access functions would enable more sophisticated algorithms downstream. For now I also only have embeddim and coordtype in mind. Maybe in the future we will need other kinds of traits to determine the underlying manifold where the coordinates live.

On a related note, I think it could also make sense to merge CoordinateTransformations.jl, Rotations.jl and Geodesy.jl into a single package for dealing with coordinates in general. It is a very general problem that end users face and we could certainly improve their experience by working in a coordinated (pun intended) effort. The current split feels a bit like premature modularization, a new type of issue that I am noticing in Julia ecosystems.

juliohm avatar Aug 18 '21 10:08 juliohm

That is precisely what we do in Meshes.Point. We do not want points to inherit all the functionality of StaticVector.

We thought about this a lot in the past. Some thoughts:

  • In languages with strong static analysis tools (static type systems or otherwise) it helps with correctness. But this benefit is quite watered down in Julia (maybe JET.jl will finally change this?).
  • Point only protects correctness when all Points live in the same affine space. This is likely true in typical geometric computations. But it's not always true in other technical computing work. In geospatial there may be varying datums, for example.
  • Various generic algorithms which are well defined on affine spaces assume vector space input in their implementation. Unfortunately this breaks plenty of things, including simple things like Statistics.mean(). As a result, I've found that trying to pass points (of various types) around tends just to result in breakage and frustration.

I think there might be some benefits for Point in terms of dispatch for graphing libraries and the like. But I kind of wish everyone would just use vector, despite its somewhat "fat" interface! (Alternatively, perhaps we should go all the way and "take geometric algebra seriously" :sweat_smile:)

The current split feels a bit like premature modularization, a new type of issue that I am noticing in Julia ecosystems.

I don't think we should join these. Geodesy in particular is very domain specific, and the APIs of Rotations and CoordinateTransformations are fundamentally different from each other (linear algebra and function composition, respectively).

A high level utility package for end users could perhaps be helpful, however.

c42f avatar Aug 20 '21 03:08 c42f

  • In languages with strong static analysis tools (static type systems or otherwise) it helps with correctness. But this benefit is quite watered down in Julia (maybe JET.jl will finally change this?).

I think it is till beneficial in the sense that developers are forced to work explicitly with these concepts. If everyone uses vectors everywhere it is not clear what concept is being manipulated, and it is very hard to generalize algorithms that way.

  • Point only protects correctness when all Points live in the same affine space. This is likely true in typical geometric computations. But it's not always true in other technical computing work. In geospatial there may be varying datums, for example.

I like to distinguish between the concept of Point and Coordinates. A point is an abstract, coordinate-free concept, which can have multiple different coordinates depending on the coordinate refererence system as you mentioned. In Meshes.jl we chose a Point because we want to take different coordinates out of it with coordinates(point). Right now these coordinates are just Cartesian, but the goal is to write something like coordinates(point, CRS) with custom reference systems and have a default reference system for the project such that coordinates(point) always converts to a common space where the computation happens. This project-wise reference system could be set in an environment variable like ENV["MESHES_PROJ_CRS"] = "Cartesian" or with a helper function like meshes_proj_crs!(Cartesian).

  • Various generic algorithms which are well defined on affine spaces assume vector space input in their implementation. Unfortunately this breaks plenty of things, including simple things like Statistics.mean(). As a result, I've found that trying to pass points (of various types) around tends just to result in breakage and frustration.

We are well aware of this issue and have plans to work with manifold surfaces and algorithms. I think my previous comment addresses this challenge. We would implement everything in terms of the point concept, not in terms of coordinates.

I think there might be some benefits for Point in terms of dispatch for graphing libraries and the like. But I kind of wish everyone would just use vector, despite its somewhat "fat" interface! (Alternatively, perhaps we should go all the way and "take geometric algebra seriously" sweat_smile)

I don't know if we really want everyone using vectors. It really introduces tons of issues like 3rd-party methods added to SVector causing slow compilation times, lack of proper constraints in the interface, lack of display of intention in algorithms, etc. BTW, we want to implement geometric algebra in the future in Meshes.jl, or at least exterior algebra and calculus.

I don't think we should join these. Geodesy in particular is very domain specific, and the APIs of Rotations and CoordinateTransformations are fundamentally different from each other (linear algebra and function composition, respectively).

I don't think they are fundamentally different, they are all just coordinates and a set of coordinate transformation methods. The fact that the transformations in Geodesy.jl are a bit more complicated and non-linear (and not represented well with linear algebra) doesn't seem to justify an independent development model. If we are all seeking more advanced pipelines with transformations then these should be developed under a common umbrella where they can be composed transparently with the eyes of many developers. In addition, my impression is that most algorithms don't need access to the linear algebra representation of rotations, they just need to take a transformation function, compose with other transformation functions and apply these functions as black boxes. If these functions are implemented with matrices multiplications internally, this is a bonus.

If we could join efforts into a cohesive Coordinates.jl (or similar name) package that would really make things easier for end users and developers. Right now people have to read different repos with different code styles to do very basic things.

juliohm avatar Aug 20 '21 12:08 juliohm

If we ignore the underlying charts, every computation eventually ends in the cartesian system, thus every point can be stored as an SVector. But there might be more short-cuts here and there to avoid the mapping from/to the cartesian system for better performance, and to do this we need to tag the charts information to the point (SVector) or points.

I would like to extends @juliohm's proposal to:

  1. an abstract trait type AbstractCoordinate with Cartesian, Polar and Spherical and maybe more sophisticated charts or atlas. This defines how we map from/to cartesian system.
  2. points and transformations should be typed with the coordinate trait, e.g., Point{Cartesian}(2, 3), Translation{Cartesian}(2, 3). This tells Julia's dispatch system more information about whether it's possible to find a shortcut.
  3. For easy usability, transformations should be expected to apply to generic points at some coordinate system. (This might not be possible for every case, but should be the goal)

For simplicity and backward compatibility, we might assume the default chart as Cartesian, and we might use names PolarChart and SphericalChart as they're currently used as point types and not coordinate type.

johnnychen94 avatar Aug 20 '21 13:08 johnnychen94

I fully support your proposal @johnnychen94. If we design a more cohesive interface via an abstract parent type we will achieve much more in the long run. The details and names of structs could be brainstormed further, but what I think is important right now is having everyone on the boat. An effort like this deserves a new package designed from scratch with this vision in mind, clear code style and multiple eyes reviewing changes.

Unfortunately the name Coordinates.jl was taken and the package is doing something completely off-topic. I will think of a name and will start reviewing the existing packages meanwhile. Hopefully we can achieve something more general soon.

juliohm avatar Aug 20 '21 15:08 juliohm

I would like to extends @juliohm's proposal to:

WIP PR in https://github.com/johnnychen94/CoordinateSystems.jl/pull/1 for preview and discussion.

johnnychen94 avatar Aug 20 '21 21:08 johnnychen94

The fact that the transformations in Geodesy.jl are a bit more complicated and non-linear (and not represented well with linear algebra) doesn't seem to justify an independent development model

I fully agree that the point types used in Geodesy could be improved and systematized with common traits — or whatever is required to make working with coordinate systems more systematic and composable across packages. I just don't think this requires pulling all the code of Geodesy into a single package with other kinds of more common coordinate transformations. But I think it would be fine if Geodesy was made to depend on a package which defines the necessary traits or supertypes to make things work more smoothly.

c42f avatar Aug 21 '21 01:08 c42f

But I think it would be fine if Geodesy was made to depend on a package which defines the necessary traits or supertypes to make things work more smoothly.

Yeah - I'm totally all for that - especially if the ecosystem can rally around something simple and common (e.g. like Tables.jl did for data).

If we design a more cohesive interface via an abstract parent type we will achieve much more in the long run

Personally I've gone away from requiring abstract parent types where that's practical but I fully support specifying well-defined, cohesive interfaces - because if it's well defined then anything else can implement it after the fact and you can use the interface to write generic & reusable code (i.e. you get all the nice "blind composition" stuff).

Working with interfaces has some simple rules - you can always implement an interface A plus extra interfaces B, C, D, etc and still be said to implement interface A. An extension to an interface must always be additive and never subtractive. The pertinent example here is that the vector space interface is a superset of the affine space interface (because a vector space is an affine space). While I feel it may be useful and valuable to define a Point type, I think it's a mistake to forbid a vector from duck typing a point.

Julia won't need to recompile modules every time someone loads a 3rd-party module operating on StaticVector

I haven't understood the cause of latency that we've been discussing here. Any relevant function defined in Base should be specialized in StaticArrays.jl, not in a 3rd-party module. Rotations.jl will define some multiplications on some of it's own subtypes, but things relating to new coordinate systems and transformations generally shouldn't cause mass invalidations (and definining new methods of new functions for SVector won't invalidate anything).

andyferris avatar Aug 21 '21 03:08 andyferris

I find it tough to decide where to be on a spectrum of interface designs with endpoints

  1. permissive, multi-purpose get-out-of-your-way

and

  1. rigorous keep-separate-things-separate

I think it can be challenging to create and use both of these different flavors of interface.

I think of Base.AbstractArray as the first type. It's both a container and a linear-algebra object. An operation like push!(vec, elt) is is not linear algebra. vec .+ scalar, I contend, is also not linear algebra. It's hard to deny how useful it is that it's versatile and conflates these two uses. But it's also hard, for the same reason, to make use of the AbstractVector interface for what is supposed to be a mathemtical / geometric vector and nothing more, since the interface assumes too much in this case.

goretkin avatar Sep 17 '21 16:09 goretkin

I try to think of interfaces and functionality in terms of layers.

AbstractArray is a container. It has containerish interfaces like indexing and push! and broadcast!. Without these things, it's not really an AbstractArray at all, and this stuff is defined directly in Base.

The LinearAlgebra module comes along and says "great, we can view arrays as a vector space; let's now add those definitions". It makes some further assertions about AbstractMatrix being a linear operator over AbstractVectors and goes on to define a series of useful algorithms, representations and transformations.

This is basically the essence of what makes Julia good at composing different packages and functionality. New packages can layer new functionality (or extend existing functionality, like +) onto existing types. Or we can create new types that opt into existing functionality or interfaces ("dictionaries can using indexing, too!").

I think there is a space for programming languages where everything is rigorously kept seperate, but it doesn't seem to the Julia way. To do the "layering" you need to be careful not to stomp on any of the guarantees from the earlier layers - and if you are careful about this then the interfaces can be rigorous and well-defined, as well as convenient.

andyferris avatar Sep 19 '21 12:09 andyferris

The "rigorous, keep things separate" approach definitely cuts both ways:

  • It ensures correct usage
  • It stifles creativity and reduces composition

It is quite hard to have both!

In terms of rigorous interfaces, Julia is quite far on the permissive end of the spectrum. From a software engineering perspective, we don't have the notion of private fields in a struct. Even more crazy, we don't even have a consistent naming convention in the ecosystem for internal fields!

I think of Base.AbstractArray as the first type. It's both a container and a linear-algebra object. An operation like push!(vec, elt) is is not linear algebra.

Yeah, it's a pretty fat interface.

And yet even in numerical linear algebra there's common iterative algorithms which need to increase the size of some matrices and vectors as the algorithm converges. So I feel like it's far from clear which API would be practical and usable. Somewhat-related, fact that numpy.Matrix was deprecated https://numpy.org/doc/stable/reference/generated/numpy.matrix.html is interesting.

c42f avatar Sep 25 '21 00:09 c42f

All great points.

Without knowing the full story I assumed that NumPy's Array vs Matrix distinction arose because e.g. multiplication of one is element-wise, and the other performs matrix multiplication. Since Python 3.5, @ is available as an operator for matrix multiplication.

On Fri, Sep 24, 2021, 8:53 PM Chris Foster @.***> wrote:

The "rigorous, keep things separate" approach definitely cuts both ways:

  • It ensures correct usage
  • It stifles creativity and reduces composition

It is quite hard to have both!

In terms of rigorous interfaces, Julia is quite far on the permissive end of the spectrum. From a software engineering perspective, we don't have the notion of private fields in a struct. Even more crazy, we don't even have a consistent naming convention in the ecosystem for internal fields!

I think of Base.AbstractArray as the first type. It's both a container and a linear-algebra object. An operation like push!(vec, elt) is is not linear algebra.

Yeah, it's a pretty fat interface.

And yet even in numerical linear algebra there's common iterative algorithms which need to increase the size of some matrices and vectors as the algorithm converges. So I feel like it's far from clear which API would be practical and usable. Somewhat-related, fact that numpy.Matrix was deprecated https://numpy.org/doc/stable/reference/generated/numpy.matrix.html is interesting.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/JuliaGeometry/CoordinateTransformations.jl/issues/75#issuecomment-926984896, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAEN3M2IGSJK4WBC5ZIIDMTUDUMRNANCNFSM5CIROZVQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

goretkin avatar Sep 25 '21 04:09 goretkin