gdal icon indicating copy to clipboard operation
gdal copied to clipboard

Add geos predicat methods for `Geometry`

Open dmarteau opened this issue 4 years ago • 24 comments

  • [x] I agree to follow the project's code of conduct.
  • [x] I added an entry to CHANGES.md if knowledge of this change could be valuable to users.

  • Add geos based predicat methods for Geometry:

    • intersects
    • disjoint
    • touches
    • crosses
    • within
    • contains
    • overlaps
  • Add build test for geos availability: have_geos

dmarteau avatar Mar 16 '21 12:03 dmarteau

@michaelkirk thanks for the review

Outside of a sys-crate, I'm surprised for methods to become defined or not based on something other than a feature flag.

The problem here is that the absence/presence of the geos based method does not depend only on rust code base but on how gdal is computed on the platform.

So what you suggests is to let users set the flags but eventually throw a build error if the methods if gdal was not compiled with Geos support ?

dmarteau avatar Mar 22 '21 08:03 dmarteau

The problem here is that the absence/presence of the geos based method does not depend only on rust code base but on how gdal is computed on the platform.

I understand that if the user wants to use gdal's geos methods, the user's build of libgdal needs to have been built with geos support, which happens outside of rust.

Your proposal makes sense for people that get everything set up correctly. I'm considering what happens for people who get it wrong - are we providing them with a useful chain of error messages?

In your current proposal, if the user tried to use one of these geos-backed methods, they would see something like:

fn envelope not defined

But I think what we want to tell them is something like:

This function requires gdal to be built with geos support.

I don't see how that's possible with the current proposal, but we could get something close using features.

I think users are more likely to understand that fn envelope is missing because of a --features misconfiguration rather than as a side-effect of what flavor of libgdal was around during the build.

So what you suggests is to let users set the flags but eventually throw a build error if the methods if gdal was not compiled with Geos support ?

Right, yes. If the feature is set but libgdal does not have geos support, we'd still error, but can show a useful error message.

I'd also recommend adding some doc(cfg), so that docs.rs can properly annotate. Something like:

e.g. https://docs.rs/proj/0.22.0/proj/struct.ProjBuilder.html#impl Screen Shot 2021-03-22 at 9 08 40 AM

michaelkirk avatar Mar 22 '21 17:03 michaelkirk

Do you know if it possible to detect in the build.rs script at compile time if the feature 'with_geos' is activated by the user ?

Or should we panic at runtime (not so nice) ?

In most case on most platform gdal is built with geos, so may be the geos option should be an opt_out (i.e without_geos) ?

dmarteau avatar Mar 23 '21 09:03 dmarteau

I see an issue: how do you enable feature in tests ? How can we tests the geos methods if only enabled by a feature ?

dmarteau avatar Mar 23 '21 09:03 dmarteau

@michaelkirk More I think about this, more I do not see very much difference between a function not available because gdal is not of the required version and function not available because Gdal is not compiled with GEOS: this is platform dependent and user cannot do very much about this.

In this line of thought having the error fn delaunay_triangulation not defined because I have the wrong version of Gdal installed is exactly the same as fn envelope not defined because Gdal is not build with GEOS support. To be consistent the version should also be a user defined feature/requirements.

dmarteau avatar Mar 23 '21 10:03 dmarteau

I see an issue: how do you enable feature in tests ? How can we tests the geos methods if only enabled by a feature ?

You can declare required features for tests in cargo.toml. Example:

[[test]]
name = "geojson"
path = "tests/geojson.rs"
required-features = ["with-geojson"]

pka avatar Mar 23 '21 14:03 pka

@michaelkirk More I think about this, more I do not see very much difference between a function not available because gdal is not of the required version and function not available because Gdal is not compiled with GEOS: this is platform dependent and user cannot do very much about this.

That's a valid point. Maybe it's not worth it. 🤷

michaelkirk avatar Mar 23 '21 15:03 michaelkirk

Do you know if it possible to detect in the build.rs script at compile time if the feature 'with_geos' is activated by the user ?

I'd probably do something like:

if cfg!(feature = "geos") {
    assert(gdal_have_geos(), "libgdal must have been built with geos to use geo methods");
}

michaelkirk avatar Mar 23 '21 15:03 michaelkirk

That's a valid point. Maybe it's not worth it.

Imho, this could even be confusing: feature is used for leveraging optional capabilities, but in this case it would leverage nothing in case gdal was not compiled with GEOS, making such specification pointless.

May be the best we can do is printing a warning at compile time to let the user knows that some features will be missing and enhance the documentation with the fact that Gdal needs GEOS for these functions ?

dmarteau avatar Mar 23 '21 15:03 dmarteau

If you link GDAL dynamically, do you still need GEOS support at compile-time?

lnicola avatar Mar 23 '21 16:03 lnicola

If you link GDAL dynamically, do you still need GEOS support at compile-time?

Could you elaborate ?

dmarteau avatar Mar 23 '21 16:03 dmarteau

I meant that the build and run-time environment can be different, so feature-detection at compile-time can fail, but the resulting binary can still work (and same in reverse). Especially here, where the a missing feature isn't reflected as a missing function (linker or loader error), but as an API that fails.

lnicola avatar Mar 23 '21 17:03 lnicola

I meant that the build and run-time environment can be different,

Yes, that defeat any compile time detection, and the same reasoning apply for the GDAL version number. But that's the way it is: if build a C/C++ code against a given library version and run it with a different version, in the best case you get a run-time link error and the only option you have to add a runtime check in the client code whenever its possible.

What are the option left ?

dmarteau avatar Mar 23 '21 17:03 dmarteau

It seems reasonable to return a Result<bool> here, since AFAIK the GEOS functions are fallible anyway (I know some of them crash in PostGIS for invalid geometries).

EDIT: other GEOS functions error on invalid geometries, I'm not sure about the predicates.

lnicola avatar Mar 23 '21 17:03 lnicola

It seems reasonable to return a Result

I'm not very enthusiastic with that option :

  1. It makes the api not very friendly and amha it has the wrong semantic.
  2. It is not useful in the way that users usings these methods are expecting it to work, and the will have no other choice to panic or to make runtime test at startup and abort if the function do not return true

dmarteau avatar Mar 23 '21 17:03 dmarteau

But C, C++ (or even Python) code calling these functions can check for errors in order to detect an incompatible GDAL and react in a reasonable way like printing an error message and quitting. By making these functions infallible, you:

  • make a compile-time check that is not relevant to the environment in which the program ends up running; availability of GEOS is a run-time, not compile-time attribute, so it shouldn't reflect in compile feature flags or configurations
  • make it hard for a developer less accustomed to GEOS and/or GDAL to detect this incompatibility (they can still call CPLGetLastErrorType if they know to look for it, but that's not exposed in gdal IIRC, only gdal-sys)
  • offer an arguably unidiomatic Rust API, as error handling is a pretty important part of the language, and in this case you're making it easy to swallow errors

lnicola avatar Mar 23 '21 18:03 lnicola

@lnicola I get your point,

But this is still bothering me:

availability of GEOS is a run-time, not compile-time attribute, so it shouldn't reflect in compile feature flags or configurations

As I said before, this is also the case with the version of gdal targeted, so why the reasonning with GEOS support does not apply to GDAL version ?

dmarteau avatar Mar 23 '21 20:03 dmarteau

It also applies to that. Two differences I can think of (without checking the source code) are:

  • GDAL sometimes removed functions from its public API, and the user might want avoid calling them by mistake
  • missing functions can prevent the program from starting, so it's good to have a way to limit the code to an older version

In comparison, the GEOS functions return a run-time error (that you would want to handle anyway) -- they're not #ifdeffed out.

lnicola avatar Mar 23 '21 20:03 lnicola

  • Removed #[cfg(have_geos)] compilation directive for geometry methods
  • Predicates return Result<bool>

dmarteau avatar Mar 24 '21 15:03 dmarteau

I wonder about performance issue:

For example if I need to handle geometries in batch mode:

  • In C/C+ code I would check once for geos availability then handle all geometries without extra check code.
  • In the current rust implementation, for each geometry I will have to pay extra call for test code.

dmarteau avatar Mar 24 '21 15:03 dmarteau

Yeah, that's a problem, but probably not such a large one. Do you mean the CPLErrorReset and CPLGetLastErrorType calls, or unwrapping the Result<bool>? In any case, the user can still call OGR_G_Disjoint and skip error checking, but that's not a good default for an API.


I also noticed that you're only checking for errors if the function returned false. While GDAL does return FALSE, in these cases, it's not documented, and not really part of the contract. The Python bindings, for example, check for error first.

lnicola avatar Mar 24 '21 16:03 lnicola

Yeah, that's a problem, but probably not such a large one. Do you mean the CPLErrorReset and CPLGetLastErrorType calls, or unwrapping the Result<bool>? In any case, the user can still call OGR_G_Disjoint and skip error checking, but that's not a good default for an API.

I mean call to CPLErrorReset and CPLGetLastErrorType .

I also noticed that you're only checking for errors if the function returned false. While GDAL does return FALSE, in these cases, it's not documented, and not really part of the contract. The Python bindings, for example, check for error first.

Actually the GDAL documentation states that in case of Error these methods always fail (i.e return FALSE for the predicates), this implies is that if the function return TRUE then no error has occured and a check is not necessary.

dmarteau avatar Mar 24 '21 16:03 dmarteau

Actually the GDAL documentation states that in case of Error these methods always fail (i.e return FALSE for the predicates), this implies is that if the function return TRUE then no error has occured and a check is not necessary.

I read it as "fails by setting an error code", not "by returning FALSE":

If OGR is built without the GEOS library, this function will always fail, issuing a CPLE_NotSupported error.

And generally it seems a bad idea to use a return value when an error might have been reported out-of-band.

lnicola avatar Mar 24 '21 16:03 lnicola

I read it as "fails by setting an error code", not "by returning FALSE":

mmh, I do not have the same intepretation, and the GDAL source code goes that way. I was concerned to take advantage of this to avoid some extra call.

dmarteau avatar Mar 24 '21 16:03 dmarteau

These were added by @metasim in #366 and #370. The versions there don't fail when GEOS is not available, but return the same thing as GDAL (e.g. a rough result or false). There is a way to query whether GEOS is enabled.

lnicola avatar Feb 14 '23 19:02 lnicola