pyogrio
pyogrio copied to clipboard
GDAL errors are also "printed" to stdout in addition to raised as Python exception
A small example:
import pyogrio
import geopandas
from shapely.geometry import Point, LineString, Polygon, box, MultiPolygon
df = geopandas.GeoDataFrame(
{"col": [1, 2, 3]},
geometry=[Point(0, 0), LineString([(0, 0), (1, 1)]), box(0, 0, 1, 1)],
crs="EPSG:4326",
)
In [3]: pyogrio.write_dataframe(df, "test_pyogrio_mixed.shp", driver="ESRI Shapefile")
ERROR 1: Attempt to write non-point (LINESTRING) geometry to point shapefile.
---------------------------------------------------------------------------
FeatureError Traceback (most recent call last)
<ipython-input-3-f50fe8b379fe> in <module>
----> 1 pyogrio.write_dataframe(df, "test_pyogrio_mixed_truly.shp", driver="ESRI Shapefile")
~/scipy/repos/pyogrio/pyogrio/geopandas.py in write_dataframe(df, path, layer, driver, encoding, geometry_type, **kwargs)
248 crs = geometry.crs.to_wkt(WktVersion.WKT1_GDAL)
249
--> 250 write(
251 path,
252 layer=layer,
~/scipy/repos/pyogrio/pyogrio/raw.py in write(path, geometry, field_data, fields, layer, driver, geometry_type, crs, encoding, **kwargs)
186 )
187
--> 188 ogr_write(
189 str(path),
190 layer=layer,
~/scipy/repos/pyogrio/pyogrio/_io.pyx in pyogrio._io.ogr_write()
FeatureError: Could not add feature to layer at index 1: Attempt to write non-point (LINESTRING) geometry to point shapefile.
In the above case, we detect that GDAL errors, get the error message and include that in the Python exception we raise. But, you can also see that the original GDAL error is still printed as well (the first line before the traceback, ERROR 1: ...
).
This is not that a big deal, but I think ideally we would only have the exception (for example, if you catch the exception to handle it in some way, you still get the printed message). And, when using fiona, this doesn't happen (df.to_file("test_fiona_mixed.shp", driver="ESRI Shapefile")
is the equivalent of the above pyogrio write, and this only gives a python exception).
I have been wondering why this is (we do use CPLErrorReset
after getting the error type and message). Comparing to Fiona, the error handling itself is very similar (since copied from Fiona), but in Fiona there is in addition also a error handler registered (CPLPushErrorHandler
):
https://github.com/Toblerity/Fiona/blob/9c8fe736d4381526905c6976c399a33d10e3ecbf/fiona/_env.pyx#L137-L155 https://github.com/Toblerity/Fiona/blob/9c8fe736d4381526905c6976c399a33d10e3ecbf/fiona/_env.pyx#L387-L395
I noticed that the GDAL python SWIG bindings have this handler defined, which I suppose is basically what we want as well (hide errors logs because those are already handled as python exception):
static CPLErrorHandler pfnPreviousHandler = CPLDefaultErrorHandler;
static void CPL_STDCALL
PythonBindingErrorHandler(CPLErr eclass, int code, const char *msg )
{
/*
** Generally we want to suppress error reporting if we have exceptions
** enabled as the error message will be in the exception thrown in
** Python.
*/
/* If the error class is CE_Fatal, we want to have a message issued
because the CPL support code does an abort() before any exception
can be generated */
if (eclass == CE_Fatal ) {
pfnPreviousHandler(eclass, code, msg );
}
/*
** We do not want to interfere with non-failure messages since
** they won't be translated into exceptions.
*/
else if (eclass != CE_Failure ) {
pfnPreviousHandler(eclass, code, msg );
}
else {
CPLSetThreadLocalConfigOption("__last_error_message", msg);
CPLSetThreadLocalConfigOption("__last_error_code", CPLSPrintf("%d", code));
}
}
(copying the snippet here, because doesn't allow to link to the line: https://github.com/OSGeo/gdal/blob/master/swig/python/extensions/ogr_wrap.cpp)
There is also a builtin CPLQuietErrorHandler
(https://gdal.org/api/cpl.html#_CPPv420CPLQuietErrorHandler6CPLErr11CPLErrorNumPKc), but that also hides warning messages.
This also affects warnings. Ideally, you'd be able to supress warnings using the stdlib, but since those are simply logged from gdal there is no way to do so. The annoying one I'm facing:
Warning 1: Value 113034553.399706051 of field AREA of feature 3746 not successfully written. Possibly due to too larger number with respect to field width
@Oreilles did you figure out a way to solve that? I write the same file with fiona and it doesn't complain. But pyogrio does. Would love to write at the speed of pyogrio.
I assume this negatively impacts reading the files afterwards as well? Are the values NULL? Or are they truncated floats? Would be good to know what happens.
@brendan-ward any ideas?
I could not fix the bug, however, I found it was possible to redirect the GDAL output elsewhere so as to not pollute my program output. Here is the code to just throw GDAL's output away (Change dev/null
to any file path if you want to preserve it somehow):
from pyogrio import set_gdal_config_options
set_gdal_config_options({"CPL_LOG": "/dev/null"})
I assume this negatively impacts reading the files afterwards as well? Are the values NULL? Or are they truncated floats? Would be good to know what happens.
All values were already correctly written for me, the warning was unexpected which is why I decided to suppress it. Of course this comes with the downside that if a valid warning were to be emitted I would miss it.
@codeananda which version of pyogrio are you using? Because my understanding is that the warnings issue should also be solved by https://github.com/geopandas/pyogrio/pull/242 (and I know in our CI we use pytest.mark.filterwarnings
to ignore some of them)
We had this specific warning in the geopandas tests as well, and added a filter for this (so filtering specifically this warning with python warnings.filterwarnings
is certainly possible). See https://github.com/geopandas/geopandas/pull/3040#discussion_r1349713787 for some discussion.
I don't really understand why GDAL raises this warning, but our conclusion was also that it is probably harmless (you could open an issue about it upstream in osgeo/gdal)
Wow thanks for the super speedy responses!
I'm using pyogrio 0.7.2.
Good to know that the values are actually written correctly.
@jorisvandenbossche maybe I'm missing something but it seems like #242 is related to suppressing warnings for tests... but are these warnings actually suppressed in the code?