pyproj
pyproj copied to clipboard
Export as WKT1_ESRI or WKT1_GDAL silently returns None for custom CRS
Code Sample, a copy-pastable example if possible
import pyproj
crs_wkt = 'PROJCRS["unknown",BASEGEOGCRS["unknown",DATUM["unknown",ELLIPSOID["WGS 84",6378137,298.257223563,LENGTHUNIT["metre",1,ID["EPSG",9001]]]],PRIMEM["Greenwich",0,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8901]]],CONVERSION["unknown",METHOD["Equidistant Cylindrical",ID["EPSG",1028]],PARAMETER["Latitude of 1st standard parallel",0,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8823]],PARAMETER["Longitude of natural origin",0,ANGLEUNIT["degree",0.0174532925199433],ID["EPSG",8802]],PARAMETER["False easting",0,LENGTHUNIT["unknown",111319.490793274],ID["EPSG",8806]],PARAMETER["False northing",0,LENGTHUNIT["unknown",111319.490793274],ID["EPSG",8807]]],CS[Cartesian,3],AXIS["(E)",east,ORDER[1],LENGTHUNIT["unknown",111319.490793274]],AXIS["(N)",north,ORDER[2],LENGTHUNIT["unknown",111319.490793274]],AXIS["ellipsoidal height (h)",up,ORDER[3],LENGTHUNIT["metre",1,ID["EPSG",9001]]]]'
crs = pyproj.CRS.from_user_input(crs_wkt)
crs.to_wkt(version="WKT1_ESRI")
Problem description
Export of the custom CRS used above to WKT using version="WKT1_ESRI"
returns None. My assumption is that the CRS cannot be expressed as WKT1_ESRI but pyproj does not raise or warn about it but happily returns None
instead. The CRS is coming from https://github.com/geopandas/geopandas/issues/2387.
Expected Output
I would expect this to raise or give me some sort of indication that the export did not happen.
Environment Information
pyproj info:
pyproj: 3.2.1
PROJ: 8.1.1
data dir: /Users/martin/mambaforge/envs/geo_dev/share/proj
user_data_dir: /Users/martin/Library/Application Support/proj
System:
python: 3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 19:24:02) [Clang 11.1.0 ]
executable: /Users/martin/mambaforge/envs/geo_dev/bin/python
machine: macOS-12.3-arm64-arm-64bit
Python deps:
certifi: 2021.10.08
pip: 21.3.1
setuptools: 58.5.3
Cython: 0.29.24
Installation method
- conda-forge
Conda environment information (if you installed with conda):
Environment (
conda list
):
$ conda list proj
# packages in environment at /Users/martin/mambaforge/envs/geo_dev:
#
# Name Version Build Channel
proj 8.1.1 h2d984c1_2 conda-forge
pyproj 3.2.1 py39had8e633_2 conda-forge
Details about
conda
and system ( conda info
):
$ conda info
active environment : geo_dev
active env location : /Users/martin/mambaforge/envs/geo_dev
shell level : 2
user config file : /Users/martin/.condarc
populated config files : /Users/martin/mambaforge/.condarc
/Users/martin/.condarc
conda version : 4.10.3
conda-build version : not installed
python version : 3.9.7.final.0
virtual packages : __osx=12.3=0
__unix=0=0
__archspec=1=arm64
base environment : /Users/martin/mambaforge (writable)
conda av data dir : /Users/martin/mambaforge/etc/conda
conda av metadata url : None
channel URLs : https://conda.anaconda.org/conda-forge/osx-arm64
https://conda.anaconda.org/conda-forge/noarch
package cache : /Users/martin/mambaforge/pkgs
/Users/martin/.conda/pkgs
envs directories : /Users/martin/mambaforge/envs
/Users/martin/.conda/envs
platform : osx-arm64
user-agent : conda/4.10.3 requests/2.26.0 CPython/3.9.7 Darwin/21.4.0 OSX/12.3
UID:GID : 501:20
netrc file : None
offline mode : False
I just realised that the same happens when trying to export to WKT1_GDAL as well.
The None
return is expected and the behavior is consistent with the to_epsg()
method where there isn't always an EPSG code in the response. I think it would make sense for geopandas to check for a None
response from to_wkt
.
I am seeing that the typing/docstring should be updated for the to_wkt
methods for the CRS/_CRS classes to specify that the response is an Optional[str]
.
To get error more descriptive information from PROJ, see: https://pyproj4.github.io/pyproj/stable/advanced_examples.html#debugging-internal-proj
The None return is expected and the behavior is consistent with the to_epsg() method where there isn't always an EPSG code in the response.
Shouldn't pyproj
at least warn about that? In both to_wkt
and to_epsg
.
Shouldn't pyproj at least warn about that? In both to_wkt and to_epsg.
Runtime warnings sound like they could be useful. Is this something you are interested in implementing?
Is this something you are interested in implementing?
I can make a PR for that, yes. Are there any other methods with the same behaviour that should be treated in the same way?
Are there any other methods with the same behaviour that should be treated in the same way?
The main one to be concerned about is to_authority
. So far I haven't seen to_json
or to_proj4
return None, but it is technically possible for them to return None as well.
Thanks :+1:
The
None
return is expected and the behavior is consistent with theto_epsg()
method where there isn't always an EPSG code in the response. I think it would make sense for geopandas to check for aNone
response fromto_wkt
.
Personally I wouldn't necessarily expect to_epsg
and to_wkt
to be consistent in this case. For to_epsg
I know that not every CRS has a corresponding EPSG number and so this might not always return something. But when doing to_wkt
, I want to get a WKT version of the CRS, and generally I would assume that every CRS can be represented in some WKT form (although this assumption thus appears to be wrong depending on the WKT version ...).
I was thinking about this last night and had the exact same thoughts as @jorisvandenbossche.
Here is an example: https://github.com/opendatacube/odc-geo/blob/d1e45e366a511cf690d36155daccac73241e52eb/odc/geo/crs.py#L34-38.
I am starting to think that a warning only on to_wkt
makes sense. What are your thoughts @martinfleis ?
For to_wkt
, I would even consider making it an error (and propagate the actual error message from PROJ).
(although I suppose that's a bigger change)
I am starting to think that a warning only on
to_wkt
makes sense.
While you can reasonably expect to_epsg
not to return a code for a custom CRS, the warning doesn't hurt. I think it is better to be explicit about why that happened than implicitly expect people to know why it failed.
For
to_wkt
, I would even consider making it an error (and propagate the actual error message from PROJ).
I agree with this but that should be probably done with a deprecation period if we decide to go that way.
While you can reasonably expect
to_epsg
not to return a code for a custom CRS, the warning doesn't hurt.
For end users, the warning indeed doesn't hurt. But for library users like the example that @snowman2 gave above (https://github.com/opendatacube/odc-geo/blob/d1e45e366a511cf690d36155daccac73241e52eb/odc/geo/crs.py#L34-L38, and we do something similar in pyogrio), that would mean they would need to start catching such warning.
Then the question is who want pyproj primarily serve. If an end user, then I'd keep the warning. If it is meant to be low-level library primarily consumed by others, hence serving developers, then let's remove it. I think that it is somewhere in between now, esp. in a way in which it is used in geopandas, where we use it internally but expose the CRS
as gdf.crs
directly.
I am happy with whatever @snowman2 prefers here.
pyproj is used by users and libraries and both are equal in importance. As demonstrated by geopandas
, CRS
is used under the hood and also exposes it to users by libraries. This is a common pattern I have encountered.
Since this issue is focused on WKT and we have consensus that a warning/error should be raised when it is None, I propose we focus the discussion on to_wkt
. I think a separate issue should be opened for to_epsg|to_authority
as that seems like it needs more thought/discussion.
Current proposed solution for to_wkt
:
- Keep the docstring/typing so that None return type is not expected.
- Warn when None return is encountered and raise a warning (probably a good idea to note in the warning that a CRSError will be raised in the future).
- In a future release, replace the warning with a CRSError (extra bonus is that it will include the PROJ error message in the exception).
I am thinking that this pattern would also be good to follow to to_proj4
and to_json
just to be consistent.
What do y'all think?
That sounds good. I'll adapt #1037 accordingly.