regions
regions copied to clipboard
Use "Shapely" or "sphere" for Astropy regions?
Should we use Shapely and / or Sphere and / or neither (i.e. re-implement or copy over things like polygon computations) as a dependency for this to-be astropy.regions
package?
I don't remember if this important question was discussed in detail or decided at PyAstro16
!?
Currently the API has shapely
in many places, but nothing is really implemented and @astrofrog, I think it might just be the left-over of some API brainstorming a year or two ago?
https://github.com/astropy/regions/search?utf8=%E2%9C%93&q=shapely
@eteq @astrofrog @keflavich @larrybradley, all - thoughts?
My personal opinion is that initially it's fine to reuse as much as possible from either of these packages because there is a lot to implement in regions and might as well avoid re-implementing things. I would prefer to see a feature complete regions package that has dependencies than a half finished package that does not 😆
One possibility would be to isolate all the calls to shapely behind some helper functions in this package so that it's easy to replace later with our own code if needed.
I agree 100% that getting functionality is the most important thing, and it's OK to use shapely
and sphere
to implement methods.
So am I understanding correctly that shapely
will likely not be part of the public API at the end?
If we use it e.g. for polygon or compound region computations, we will wrap it and expose our own API for those things?
At the moment Shapely is part of the PixelRegion
base class public API:
http://astropy-regions.readthedocs.io/en/latest/api/regions.PixelRegion.html#regions.PixelRegion.to_shapely
There's a somewhat weird implementation here which I guess produces a polygon? http://astropy-regions.readthedocs.io/en/latest/_modules/regions/shapes/circle.html#CirclePixelRegion.to_shapely
And here it's not implemented: http://astropy-regions.readthedocs.io/en/latest/_modules/regions/shapes/rectangle.html#RectanglePixelRegion.to_shapely
I guess my question is: should we remove to_shapely
from the public API? Remove it from the PixelRegion
base class completely? Or do we want to have the to / from Shapely from all pixel regions for some reason?
Either way is fine with me ... I just want to understand what should be done, especially concerning compound regions it's not clear to me yet how they should be implemented. A tree of connected region objects? Always go to polygons and use Shapely?
So there are two aspects here - I think that if we use shapely or sphere for actual calculations, then that should be hidden - but I also don't think there's anything wrong with having a to_shapely
method in the public API (irrespective of whether we use shapely behind the scenes). I think there are genuine use cases where one might want to convert regions to say matplotlib patches or shapely polygons?
The disscussion for shapely is resolved in #209 .
I'll keep this issue open for now, since we still don't have spherical polygons, so the discussion on whether to use sphere
or to develop and maintain spherical polygon code in astropy.regions
is still open. I don't think anyone has time to hack on this at the moment anyways, but maybe @larrybradley or someone from STSCI could find out if there are any plans concerning the sphere
package? Do you want to maintain it? Do you consider it Astropy core material?
IMO having spherical polygon in astropy core would be good, and astropy.regions
would be the appropriate place. I think the sphere package seems way too much code for that task, it should be possible to re-implement or extract spherical polygon functionality with ~ 1000 lines or less of C or Cython. But I don't really know and unfortunately don't have time to work on this. This could also be an interesting and well-defined GSoC project for next year?
@cdeil I think the spherical_geometry package is something that should be considered for astropy core and we'll be glad to have more community involvement with maintaining the package.
I'd like to come back to this question of how to implement regions.PolygonSkyRegion.
The most important operation is "contains", i.e. we need a "point in spherical polygon" method.
I'm starting a design document for Regions in Gammapy, and having spherical polygons available could be a brute-force way to support any spherical region (users should polygonise). We don't need much, basically just "point in spherical polygon", and then we can filter tables of sky positions, or make masks for sky images by taking pixel center coordinate grids and calling "point in polygon".
From what I gather, this is difficult to implement, and the work has been done in https://github.com/spacetelescope/spherical_geometry , so one option could be to just import and use that. The concern with that is that it relies on a bundled C++ library for high-precision float computations, and C++ is not as portable as just Numpy, C or Cython. E.g. currently if I try to pip install spherical-geometry
with conda on macOS I get:
spherical_geometry/../cextern/qd-library/src/bits.cpp:14:10: fatal error: 'iostream' file not found
#include <iostream>
^~~~~~~~~~
1 warning and 1 error generated.
error: command 'gcc' failed with exit status 1
conda install spherical-geometry
does work, but the point remains - we'd drag C++ in, and that is not as portable as C, and has been avoided so far for Astropy core and astropy-regions.
I guess spherical-geometry
could be a delayed import, so it could be an optional dependency, but still, if we start using spherical polygons in Gammapy it's needed for analysis and effectively becomes a core dependency that must be easy to install.
How hard is it to implement spherical point in polygon from scratch, just using Numpy, or with a C or Cython extension?
I tried to look around a bit at https://github.com/lsst/sphgeom or https://github.com/google/s2geometry and found this, and it seems pretty complicated. Does it really require a high-precision math library? Does a C implementation exist?
@fxpineau - maybe you have the code to do it as part of your CDS Java or Rust lib? If yes, could you please link to the relevant bits?
@larrybradley @astrofrog or anyone - thoughts?
@cdeil you can either look at the Rust code, struct Polygon (from line 86 to 118), or at the Java code, method contains at line 200.
@fxpineau - Thanks!
But these parts of the code aren't isolated, they use your other classes and methods.
Can you comment on these questions:
- How much C or Cython code would be needed to just have a minimal spherical "point in polygon" with your method? I mean something like what we did in https://github.com/astropy/regions/blob/75bc04d2af6a85a30c95389069de9a87356f2d1b/regions/_geometry/pnpoly.pyx#L91-L116 - for planar case it's just 20 lines.
- Do you just use
float64
math, or is there a doublefloat64
code somewhere behind, like there is in spherical-geometry with the https://github.com/spacetelescope/spherical_geometry/tree/master/cextern/qd-library ?
Sorry for being lazy - I'll have a closer look at your code tomorrow or Monday latest.
@cdeil
- Maybe 100-200 lines. But if you can assume that a polygon has all its vertices in a semi-hemisphere, then you can just apply the Gnomonic projection (the astropy WCS code is in C?) and then apply the existing 2D polygon code on the projected vertices (great-circles are projected as strait lines in the Gnomonic proj).
- Just float64, not sure more is needed in our case.
Maybe 100-200 lines
That's not so bad.
Then I think we should try to extract from your code and avoid any extra dependency in astropy-regions
for not. Especially not add a dependency on spherical_geometry
and introduce C++.
Just float64, not sure more is needed in our case.
@mcara - can you try to find out and put a note in the spherical_geometry
docs somewhere why qd-library
is used? Is it for point in polygon, or only for more fancy methods?
But if you can assume that a polygon has all its vertices in a semi-hemisphere, then you can just apply the Gnomonic projection (the astropy WCS code is in C?) and then apply the existing 2D polygon code on the projected vertices (great-circles are projected as strait lines in the Gnomonic proj).
Hmm, yes, that would be very simple to implement. (yes astropy.wcs
just is a Python C extension wrapping WCSLib - no C++ or extra dependency involved).
But is it really useful to have this? I mean - very quickly people will come with other polygons that don't fit in a hemisphere, no?
I see Boost also has some polygon stuff:
https://www.boost.org/doc/libs/1_70_0/libs/polygon/doc/index.htm
But again if you look at the spherical polygon code - it's very complex (like in the Google S2 code):
https://github.com/boostorg/geometry/blob/develop/include/boost/geometry/strategies/spherical/point_in_poly_winding.hpp
@fxpineau - I'll look at your Rust or Java code and try to extract a standalone version and port it to C next week.
It is also worth mentioning that @mbtaylor implemented the spherical polygon contains method in TOPCAT a few monthes ago, see his code. The approach is very similar (derived from the 2D algo, adapted for the sphere). The main limitation for you may be the licence (LGPL). The TOPCAT code is probably faster if you want to test one point with one polygon. The CDS Healpix code is probably faster if you want to test several points with one polygon. (For the later, it is basically a mater of pre-computing or not the vectorial product of all consecutive pair of vertices).
My code that @fxpineau mentions above is quite self-contained, and only a couple of hundred lines including comments. He's made a couple of suggestions to tweak it which I will probably implement in the next few days, but they don't affect the algorithm. I agree with his assessment of efficiency for different usage scenarios. If it helps for me to declare this available under some different license, I'm quite happy to do that.
@fxpineau @mbtaylor - Thanks!
Concerning performance, I think having many points to test, in the form of Numpy arrays, is the common and important case. E.g. when you have an event list, or a source catalog, or an array of sky coordinates which correspond to pixel center locations when making a mask from a region.
One thing we'll have to have a closer look at and decide is how inside and outside are defined.
See here and here how it's done in spherical_geometry
. Do we want the same, i.e. add an "inside" data member to the class to give the user the choice, plus have this default if not given? Might also be useful to compare what DS9 does, since many people use DS9 for display or for region serialisation.
See the notes here and here how @fxpineau and @mbtaylor currently define inside.
I add @tomdonaldson, he is interesting in testing one point with several polygons in a Web Browser. (I made a WebAssembly demo here, but he may be interesting in porting the @mbtaylor code).
@cdeil in the Java version, I implemented several options to make the difference between the inside and the outside, see e.g. this one.
For non-self-intersecting polygons, the inside/outside often depends on the order in which the vertices are provided.
I would suggest to do the same as spherical_geometry.SphericalPolygon, i.e. to give the caller the power to define a point that's inside
which gives max flexibility, and to have their simple default of mean as inside. Just to be clear: the only reason to not import spherical_geometry.SphericalPolygon
and wrap it here in a new API for astropy.regions.SkyPolygon
is to avoid introducing C++ here for astropy-regions
(which eventually should be come astropy.regions
in the core).
Would everyone be happy with that way to define inside / outside?
And concerning implementation: add a few 100 lines of C & Cython for spherical polygons to this astropy-regions
package, based on the existing code in other packages?
@cdeil It looks like C code is used in a lot of places. The reason for using this library, I think, is to mitigate the accuracy loss when computing cross products between close to each other vectors, i.e., when testing whether a point lies on an arc defined by two end points, etc. I am currently learning the code and my hope is that computations might be re-arranged in a different way in order to reduce loss of accuracy.