pyresample icon indicating copy to clipboard operation
pyresample copied to clipboard

Switch to pyproj 2.3+ for Transformer and CRS objects

Open djhoese opened this issue 5 years ago • 20 comments

Background

Pyresample has used old pyproj interfaces for a long time and depends heavily on the Proj class to transform between lon/lat degree coordinates and X/Y projection coordinates. We also depend heavily on PROJ parameter dictionaries and strings. PROJ strings are overall deprecated and are unable of representing all Coordinate Reference Systems (CRS) out there. The PROJ library suggests WellKnownText (WKT).

Forgive me for the lack of references to the claims in this issue.

Suggestions

  1. Drop the custom Proj classes in:

    https://github.com/pytroll/pyresample/blob/3d6d38d40a30e2ae943dbbe7fe46766db7a77a19/pyresample/_spatial_mp.py#L129-L137

    These checks are meant to produce lon/lat degrees when dealing with a geographic CRS (+proj=latlong). As of pyproj 2.3.0 this is done by default. This current check also makes it so certain geographic CRSes are not possible (ex. +proj=latlong +pm=180).

  2. Switch to the pyproj Transformer class for all current Proj uses. It is the recommended interface and would go something like this (I think):

    from pyproj import Transformer, CRS
    t = Transformer.from_crs(CRS('EPSG:4326'), area_def.crs, always_xy=True)
    x, y = t.transform(lon, lat)
    

    Without the always_xy=True this would flip the expected/returned order of the axes. In the case of EPSG:4326, this expects latitude first, longitude second.

  3. Drop PROJ parameters as internally stored representation of CRS information. See #264 for first attempts at storing WKT instead.

CC @snowman2 (pyproj maintainer). If you have any other suggestions, corrections, or other feedback, please let us know.

djhoese avatar Apr 13 '20 15:04 djhoese

For 2:

t = Transformer.from_crs(area_def.crs.geodetic_crs, area_def.crs, always_xy=True)

snowman2 avatar Apr 13 '20 15:04 snowman2

Extra bonus of this hard requirement for pyresample is that we should then be able to switch our unit tests to do:

produced_area_def.crs == expected_crs

And let pyproj/PROJ handle the comparison complexity.

djhoese avatar Apr 13 '20 15:04 djhoese

@snowman2 Thanks for the feedback. If we were using this to transform from traditional geodetic lon/lat coordinates (say from a scientific dataset) to a projected CRS like (ex. +proj=lcc +lat_1=25 +lat_2=35) then doesn't that mean that our transformer would depend on the definition of our target CRS? For example, what if I did +proj=lcc +pm=45 +lat_1=25 +lat_2=35 as my target CRS, then the source CRS of the Transformer would have a prime meridian of 45 degrees instead of the expected 0, right?

djhoese avatar Apr 13 '20 15:04 djhoese

Randomly inserted, but thought it might be helpful: https://pyproj4.github.io/pyproj/stable/gotchas.html#proj-not-a-generic-latitude-longitude-to-projection-converter

snowman2 avatar Apr 13 '20 16:04 snowman2

If we were using this to transform from traditional geodetic lon/lat coordinates (say from a scientific dataset) to a projected CRS like (ex. +proj=lcc +lat_1=25 +lat_2=35) then doesn't that mean that our transformer would depend on the definition of our target CRS?

The example I gave was if you wanted behavior similar to the Proj class as you were using it. See the link above for deciding if you want to use EPSG:4326 or the geodetic_crs property.

snowman2 avatar Apr 13 '20 16:04 snowman2

Perfect! I think we want EPSG:4326 to do things the "right" way, but have been using Proj as that FAQ recommends not doing :wink:

In the long run, in Satpy, I hope to be explicitly providing a CRS for the input data (could be geographic lon/lats or projected X/Y coordinates) and an explicit CRS for the output data.

djhoese avatar Apr 13 '20 17:04 djhoese

To make things clear, are CRS object still not thread safe ?

mraspaud avatar Apr 14 '20 07:04 mraspaud

Correct. CRS and Proj objects (and Transform objects @snowman2 ?) are not thread-safe. We should be sending the projection parameters to the threads (which can be easily serialized) and then create the CRS/Proj objects in each thread.

djhoese avatar Apr 14 '20 12:04 djhoese

It depends how you use them. They are not safe to share between multiple threads, but they are safe when used by only one thread at a time. This is also true for the Transformer, Proj, and Geod classes I believe.

snowman2 avatar Apr 14 '20 12:04 snowman2

Sounds good @djhoese ! And thanks @snowman2 for your feedback! Another bonus: Seems like the transformation of space pixels in the geostationary projection to lat/lon would then return inf instead of 1e+30:

>>> latlong = CRS.from_dict({'proj': 'latlong'})
>>> geos = CRS.from_dict({'proj': 'geos', 'h': 35785831.0, 'units': 'm', 'lon_0': 0})
>>> t = Transformer.from_crs(geos, regular)
>>> t.transform(-5570248.686685662, -5567248.28340708)
(inf, inf)

sfinkens avatar Apr 22 '20 14:04 sfinkens

Seems like the transformation of space pixels in the geostationary projection to lat/lon would then return inf instead of 1e+30

@sfinkens This has been this way for a long time. I think even with the Proj interface.

djhoese avatar Apr 22 '20 14:04 djhoese

Oh... Nevermind then :)

sfinkens avatar Apr 22 '20 14:04 sfinkens

Another thing I'm realizing needs to be decided on: How does the AreaDefinition.__str__ get changed? It currently produces:

Area ID: test
Description: test
Projection ID: test
Projection: {'datum': 'WGS84', 'lat_0': '0', 'lat_1': '25', 'lat_2': '25', 'lon_0': '-80', 'no_defs': 'None', 'proj': 'lcc', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 1000
Number of rows: 1000
Area extent: (-3459087.6581, 1622996.9636, 1108961.8103, 8007863.9335)

Are we OK changing "Projection" to showing the WKT? I'm sure this will break some tests, but is something we should probably do.

djhoese avatar May 23 '20 02:05 djhoese

Using wkt in the string output will be way too verbose imo.

mraspaud avatar Jun 06 '20 05:06 mraspaud

Yes, but if the PROJ string doesn't fully describe a projection (not the case for some/most of our cases) then users will fall in to the trap of using __str__ output for comparing two AreaDefinitions and being wrong. I agree though...not great.

djhoese avatar Jun 06 '20 15:06 djhoese

One option would be to use the __repr__ of the CRS class.

snowman2 avatar Jun 06 '20 16:06 snowman2

Update on thread safety: https://github.com/pyproj4/pyproj/issues/782

snowman2 avatar Mar 19 '21 01:03 snowman2

@snowman2 are the thread safety changes released yet?

djhoese avatar Mar 22 '21 19:03 djhoese

And are there any performance concerns? Would passing two CRS WKT strings to another thread, then creating a Transformer object be better/faster than sending the transformer object itself?

djhoese avatar Mar 22 '21 19:03 djhoese

are the thread safety changes released yet?

The will be included in the 3.1 release.

And are there any performance concerns?

I am not aware of any,

Would passing two CRS WKT strings to another thread, then creating a Transformer object be better/faster than sending the transformer object itself?

Probably about the same.

snowman2 avatar Mar 22 '21 20:03 snowman2