pyresample
pyresample copied to clipboard
Switch to pyproj 2.3+ for Transformer and CRS objects
Background
Pyresample has used old pyproj interfaces for a long time and depends heavily on the Proj class to transform between lon/lat degree coordinates and X/Y projection coordinates. We also depend heavily on PROJ parameter dictionaries and strings. PROJ strings are overall deprecated and are unable of representing all Coordinate Reference Systems (CRS) out there. The PROJ library suggests WellKnownText (WKT).
Forgive me for the lack of references to the claims in this issue.
Suggestions
-
Drop the custom
Projclasses in:https://github.com/pytroll/pyresample/blob/3d6d38d40a30e2ae943dbbe7fe46766db7a77a19/pyresample/_spatial_mp.py#L129-L137
These checks are meant to produce lon/lat degrees when dealing with a geographic CRS (
+proj=latlong). As of pyproj 2.3.0 this is done by default. This current check also makes it so certain geographic CRSes are not possible (ex.+proj=latlong +pm=180). -
Switch to the pyproj
Transformerclass for all currentProjuses. It is the recommended interface and would go something like this (I think):from pyproj import Transformer, CRS t = Transformer.from_crs(CRS('EPSG:4326'), area_def.crs, always_xy=True) x, y = t.transform(lon, lat)Without the
always_xy=Truethis would flip the expected/returned order of the axes. In the case of EPSG:4326, this expects latitude first, longitude second. -
Drop PROJ parameters as internally stored representation of CRS information. See #264 for first attempts at storing WKT instead.
CC @snowman2 (pyproj maintainer). If you have any other suggestions, corrections, or other feedback, please let us know.
For 2:
t = Transformer.from_crs(area_def.crs.geodetic_crs, area_def.crs, always_xy=True)
Extra bonus of this hard requirement for pyresample is that we should then be able to switch our unit tests to do:
produced_area_def.crs == expected_crs
And let pyproj/PROJ handle the comparison complexity.
@snowman2 Thanks for the feedback. If we were using this to transform from traditional geodetic lon/lat coordinates (say from a scientific dataset) to a projected CRS like (ex. +proj=lcc +lat_1=25 +lat_2=35) then doesn't that mean that our transformer would depend on the definition of our target CRS? For example, what if I did +proj=lcc +pm=45 +lat_1=25 +lat_2=35 as my target CRS, then the source CRS of the Transformer would have a prime meridian of 45 degrees instead of the expected 0, right?
Randomly inserted, but thought it might be helpful: https://pyproj4.github.io/pyproj/stable/gotchas.html#proj-not-a-generic-latitude-longitude-to-projection-converter
If we were using this to transform from traditional geodetic lon/lat coordinates (say from a scientific dataset) to a projected CRS like (ex. +proj=lcc +lat_1=25 +lat_2=35) then doesn't that mean that our transformer would depend on the definition of our target CRS?
The example I gave was if you wanted behavior similar to the Proj class as you were using it. See the link above for deciding if you want to use EPSG:4326 or the geodetic_crs property.
Perfect! I think we want EPSG:4326 to do things the "right" way, but have been using Proj as that FAQ recommends not doing :wink:
In the long run, in Satpy, I hope to be explicitly providing a CRS for the input data (could be geographic lon/lats or projected X/Y coordinates) and an explicit CRS for the output data.
To make things clear, are CRS object still not thread safe ?
Correct. CRS and Proj objects (and Transform objects @snowman2 ?) are not thread-safe. We should be sending the projection parameters to the threads (which can be easily serialized) and then create the CRS/Proj objects in each thread.
It depends how you use them. They are not safe to share between multiple threads, but they are safe when used by only one thread at a time. This is also true for the Transformer, Proj, and Geod classes I believe.
Sounds good @djhoese ! And thanks @snowman2 for your feedback! Another bonus: Seems like the transformation of space pixels in the geostationary projection to lat/lon would then return inf instead of 1e+30:
>>> latlong = CRS.from_dict({'proj': 'latlong'})
>>> geos = CRS.from_dict({'proj': 'geos', 'h': 35785831.0, 'units': 'm', 'lon_0': 0})
>>> t = Transformer.from_crs(geos, regular)
>>> t.transform(-5570248.686685662, -5567248.28340708)
(inf, inf)
Seems like the transformation of space pixels in the geostationary projection to lat/lon would then return inf instead of 1e+30
@sfinkens This has been this way for a long time. I think even with the Proj interface.
Oh... Nevermind then :)
Another thing I'm realizing needs to be decided on: How does the AreaDefinition.__str__ get changed? It currently produces:
Area ID: test
Description: test
Projection ID: test
Projection: {'datum': 'WGS84', 'lat_0': '0', 'lat_1': '25', 'lat_2': '25', 'lon_0': '-80', 'no_defs': 'None', 'proj': 'lcc', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 1000
Number of rows: 1000
Area extent: (-3459087.6581, 1622996.9636, 1108961.8103, 8007863.9335)
Are we OK changing "Projection" to showing the WKT? I'm sure this will break some tests, but is something we should probably do.
Using wkt in the string output will be way too verbose imo.
Yes, but if the PROJ string doesn't fully describe a projection (not the case for some/most of our cases) then users will fall in to the trap of using __str__ output for comparing two AreaDefinitions and being wrong. I agree though...not great.
One option would be to use the __repr__ of the CRS class.
Update on thread safety: https://github.com/pyproj4/pyproj/issues/782
@snowman2 are the thread safety changes released yet?
And are there any performance concerns? Would passing two CRS WKT strings to another thread, then creating a Transformer object be better/faster than sending the transformer object itself?
are the thread safety changes released yet?
The will be included in the 3.1 release.
And are there any performance concerns?
I am not aware of any,
Would passing two CRS WKT strings to another thread, then creating a Transformer object be better/faster than sending the transformer object itself?
Probably about the same.