regions icon indicating copy to clipboard operation
regions copied to clipboard

Use "Shapely" or "sphere" for Astropy regions?

Open cdeil opened this issue 8 years ago • 18 comments

Should we use Shapely and / or Sphere and / or neither (i.e. re-implement or copy over things like polygon computations) as a dependency for this to-be astropy.regions package?

I don't remember if this important question was discussed in detail or decided at PyAstro16!?

Currently the API has shapely in many places, but nothing is really implemented and @astrofrog, I think it might just be the left-over of some API brainstorming a year or two ago? https://github.com/astropy/regions/search?utf8=%E2%9C%93&q=shapely

@eteq @astrofrog @keflavich @larrybradley, all - thoughts?

cdeil avatar Jun 26 '16 09:06 cdeil

My personal opinion is that initially it's fine to reuse as much as possible from either of these packages because there is a lot to implement in regions and might as well avoid re-implementing things. I would prefer to see a feature complete regions package that has dependencies than a half finished package that does not 😆

astrofrog avatar Jun 27 '16 22:06 astrofrog

One possibility would be to isolate all the calls to shapely behind some helper functions in this package so that it's easy to replace later with our own code if needed.

astrofrog avatar Jun 27 '16 22:06 astrofrog

I agree 100% that getting functionality is the most important thing, and it's OK to use shapely and sphere to implement methods.

So am I understanding correctly that shapely will likely not be part of the public API at the end? If we use it e.g. for polygon or compound region computations, we will wrap it and expose our own API for those things?

At the moment Shapely is part of the PixelRegion base class public API: http://astropy-regions.readthedocs.io/en/latest/api/regions.PixelRegion.html#regions.PixelRegion.to_shapely

There's a somewhat weird implementation here which I guess produces a polygon? http://astropy-regions.readthedocs.io/en/latest/_modules/regions/shapes/circle.html#CirclePixelRegion.to_shapely

And here it's not implemented: http://astropy-regions.readthedocs.io/en/latest/_modules/regions/shapes/rectangle.html#RectanglePixelRegion.to_shapely

I guess my question is: should we remove to_shapely from the public API? Remove it from the PixelRegion base class completely? Or do we want to have the to / from Shapely from all pixel regions for some reason?

Either way is fine with me ... I just want to understand what should be done, especially concerning compound regions it's not clear to me yet how they should be implemented. A tree of connected region objects? Always go to polygons and use Shapely?

cdeil avatar Jun 28 '16 08:06 cdeil

So there are two aspects here - I think that if we use shapely or sphere for actual calculations, then that should be hidden - but I also don't think there's anything wrong with having a to_shapely method in the public API (irrespective of whether we use shapely behind the scenes). I think there are genuine use cases where one might want to convert regions to say matplotlib patches or shapely polygons?

astrofrog avatar Jun 28 '16 09:06 astrofrog

The disscussion for shapely is resolved in #209 .

I'll keep this issue open for now, since we still don't have spherical polygons, so the discussion on whether to use sphere or to develop and maintain spherical polygon code in astropy.regions is still open. I don't think anyone has time to hack on this at the moment anyways, but maybe @larrybradley or someone from STSCI could find out if there are any plans concerning the sphere package? Do you want to maintain it? Do you consider it Astropy core material?

IMO having spherical polygon in astropy core would be good, and astropy.regions would be the appropriate place. I think the sphere package seems way too much code for that task, it should be possible to re-implement or extract spherical polygon functionality with ~ 1000 lines or less of C or Cython. But I don't really know and unfortunately don't have time to work on this. This could also be an interesting and well-defined GSoC project for next year?

cdeil avatar Aug 22 '18 12:08 cdeil

@cdeil I think the spherical_geometry package is something that should be considered for astropy core and we'll be glad to have more community involvement with maintaining the package.

larrybradley avatar Sep 05 '18 16:09 larrybradley

I'd like to come back to this question of how to implement regions.PolygonSkyRegion.

The most important operation is "contains", i.e. we need a "point in spherical polygon" method.

I'm starting a design document for Regions in Gammapy, and having spherical polygons available could be a brute-force way to support any spherical region (users should polygonise). We don't need much, basically just "point in spherical polygon", and then we can filter tables of sky positions, or make masks for sky images by taking pixel center coordinate grids and calling "point in polygon".

From what I gather, this is difficult to implement, and the work has been done in https://github.com/spacetelescope/spherical_geometry , so one option could be to just import and use that. The concern with that is that it relies on a bundled C++ library for high-precision float computations, and C++ is not as portable as just Numpy, C or Cython. E.g. currently if I try to pip install spherical-geometry with conda on macOS I get:

    spherical_geometry/../cextern/qd-library/src/bits.cpp:14:10: fatal error: 'iostream' file not found
    #include <iostream>
             ^~~~~~~~~~
    1 warning and 1 error generated.
    error: command 'gcc' failed with exit status 1

conda install spherical-geometry does work, but the point remains - we'd drag C++ in, and that is not as portable as C, and has been avoided so far for Astropy core and astropy-regions.

I guess spherical-geometry could be a delayed import, so it could be an optional dependency, but still, if we start using spherical polygons in Gammapy it's needed for analysis and effectively becomes a core dependency that must be easy to install.


How hard is it to implement spherical point in polygon from scratch, just using Numpy, or with a C or Cython extension?

I tried to look around a bit at https://github.com/lsst/sphgeom or https://github.com/google/s2geometry and found this, and it seems pretty complicated. Does it really require a high-precision math library? Does a C implementation exist?

@fxpineau - maybe you have the code to do it as part of your CDS Java or Rust lib? If yes, could you please link to the relevant bits?

@larrybradley @astrofrog or anyone - thoughts?

cdeil avatar May 16 '19 16:05 cdeil

@cdeil you can either look at the Rust code, struct Polygon (from line 86 to 118), or at the Java code, method contains at line 200.

fxpineau avatar May 16 '19 17:05 fxpineau

@fxpineau - Thanks!

But these parts of the code aren't isolated, they use your other classes and methods.

Can you comment on these questions:

  • How much C or Cython code would be needed to just have a minimal spherical "point in polygon" with your method? I mean something like what we did in https://github.com/astropy/regions/blob/75bc04d2af6a85a30c95389069de9a87356f2d1b/regions/_geometry/pnpoly.pyx#L91-L116 - for planar case it's just 20 lines.
  • Do you just use float64 math, or is there a double float64 code somewhere behind, like there is in spherical-geometry with the https://github.com/spacetelescope/spherical_geometry/tree/master/cextern/qd-library ?

Sorry for being lazy - I'll have a closer look at your code tomorrow or Monday latest.

cdeil avatar May 16 '19 17:05 cdeil

@cdeil

  • Maybe 100-200 lines. But if you can assume that a polygon has all its vertices in a semi-hemisphere, then you can just apply the Gnomonic projection (the astropy WCS code is in C?) and then apply the existing 2D polygon code on the projected vertices (great-circles are projected as strait lines in the Gnomonic proj).
  • Just float64, not sure more is needed in our case.

fxpineau avatar May 16 '19 17:05 fxpineau

Maybe 100-200 lines

That's not so bad.

Then I think we should try to extract from your code and avoid any extra dependency in astropy-regions for not. Especially not add a dependency on spherical_geometry and introduce C++.

Just float64, not sure more is needed in our case.

@mcara - can you try to find out and put a note in the spherical_geometry docs somewhere why qd-library is used? Is it for point in polygon, or only for more fancy methods?

But if you can assume that a polygon has all its vertices in a semi-hemisphere, then you can just apply the Gnomonic projection (the astropy WCS code is in C?) and then apply the existing 2D polygon code on the projected vertices (great-circles are projected as strait lines in the Gnomonic proj).

Hmm, yes, that would be very simple to implement. (yes astropy.wcs just is a Python C extension wrapping WCSLib - no C++ or extra dependency involved).

But is it really useful to have this? I mean - very quickly people will come with other polygons that don't fit in a hemisphere, no?


I see Boost also has some polygon stuff:

https://www.boost.org/doc/libs/1_70_0/libs/polygon/doc/index.htm

But again if you look at the spherical polygon code - it's very complex (like in the Google S2 code):

https://github.com/boostorg/geometry/blob/develop/include/boost/geometry/strategies/spherical/point_in_poly_winding.hpp

@fxpineau - I'll look at your Rust or Java code and try to extract a standalone version and port it to C next week.

cdeil avatar May 16 '19 17:05 cdeil

It is also worth mentioning that @mbtaylor implemented the spherical polygon contains method in TOPCAT a few monthes ago, see his code. The approach is very similar (derived from the 2D algo, adapted for the sphere). The main limitation for you may be the licence (LGPL). The TOPCAT code is probably faster if you want to test one point with one polygon. The CDS Healpix code is probably faster if you want to test several points with one polygon. (For the later, it is basically a mater of pre-computing or not the vectorial product of all consecutive pair of vertices).

fxpineau avatar May 16 '19 22:05 fxpineau

My code that @fxpineau mentions above is quite self-contained, and only a couple of hundred lines including comments. He's made a couple of suggestions to tweak it which I will probably implement in the next few days, but they don't affect the algorithm. I agree with his assessment of efficiency for different usage scenarios. If it helps for me to declare this available under some different license, I'm quite happy to do that.

mbtaylor avatar May 17 '19 06:05 mbtaylor

@fxpineau @mbtaylor - Thanks!

Concerning performance, I think having many points to test, in the form of Numpy arrays, is the common and important case. E.g. when you have an event list, or a source catalog, or an array of sky coordinates which correspond to pixel center locations when making a mask from a region.

One thing we'll have to have a closer look at and decide is how inside and outside are defined.

See here and here how it's done in spherical_geometry. Do we want the same, i.e. add an "inside" data member to the class to give the user the choice, plus have this default if not given? Might also be useful to compare what DS9 does, since many people use DS9 for display or for region serialisation.

See the notes here and here how @fxpineau and @mbtaylor currently define inside.

cdeil avatar May 17 '19 06:05 cdeil

I add @tomdonaldson, he is interesting in testing one point with several polygons in a Web Browser. (I made a WebAssembly demo here, but he may be interesting in porting the @mbtaylor code).

fxpineau avatar May 17 '19 07:05 fxpineau

@cdeil in the Java version, I implemented several options to make the difference between the inside and the outside, see e.g. this one.

For non-self-intersecting polygons, the inside/outside often depends on the order in which the vertices are provided.

fxpineau avatar May 17 '19 07:05 fxpineau

I would suggest to do the same as spherical_geometry.SphericalPolygon, i.e. to give the caller the power to define a point that's inside which gives max flexibility, and to have their simple default of mean as inside. Just to be clear: the only reason to not import spherical_geometry.SphericalPolygon and wrap it here in a new API for astropy.regions.SkyPolygon is to avoid introducing C++ here for astropy-regions (which eventually should be come astropy.regions in the core).

Would everyone be happy with that way to define inside / outside? And concerning implementation: add a few 100 lines of C & Cython for spherical polygons to this astropy-regions package, based on the existing code in other packages?

cdeil avatar May 17 '19 07:05 cdeil

@cdeil It looks like C code is used in a lot of places. The reason for using this library, I think, is to mitigate the accuracy loss when computing cross products between close to each other vectors, i.e., when testing whether a point lies on an arc defined by two end points, etc. I am currently learning the code and my hope is that computations might be re-arranged in a different way in order to reduce loss of accuracy.

mcara avatar May 18 '19 07:05 mcara