Allow extra data to be attached to geometries (proof of concept).

Open frewsxcv opened this issue 3 years ago • 1 comment

For a couple of projects I'm working on, I'd like to be able to attach arbitrary metadata and properties to geometries. Right now that isn't possible with our geometry types, so this is a proof of concept of what that would look like. All geometry types (excluding the Geometry enum itself) could have an ext field, whose type would be specified by the Ext generic parameter, which defaults to () for better ergonomics.
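To make the shape of the proposal concrete, here is a minimal sketch of a defaulted `Ext` parameter. It assumes a stand-in `Point` type defined locally; it is not the real `geo_types::Point`, and the constructor names `new` and `with_ext` and the helper functions are hypothetical.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a geo-types geometry; not the real `geo_types::Point`.
#[derive(Debug, Clone, PartialEq)]
pub struct Point<T, Ext = ()> {
    pub x: T,
    pub y: T,
    // Extra data attached to the geometry; `()` when nothing is attached.
    pub ext: Ext,
}

impl<T> Point<T> {
    // Constructor for the common case with no attached data (`Ext = ()`).
    pub fn new(x: T, y: T) -> Self {
        Point { x, y, ext: () }
    }
}

impl<T, Ext> Point<T, Ext> {
    // Constructor that attaches caller-supplied data (hypothetical name).
    pub fn with_ext(x: T, y: T, ext: Ext) -> Self {
        Point { x, y, ext }
    }
}

type GpxMetadata = HashMap<String, String>;

// Existing signatures keep working because `Point<f64>` means `Point<f64, ()>`.
pub fn plain_point() -> Point<f64> {
    Point::new(1.0, 2.0)
}

// Opting in to metadata only changes the second type argument.
pub fn point_with_metadata() -> Point<f64, GpxMetadata> {
    let mut meta = GpxMetadata::new();
    meta.insert("name".to_string(), "trailhead".to_string());
    Point::with_ext(1.0, 2.0, meta)
}
```

Because `()` is a zero-sized type, the defaulted form should add no per-geometry memory in a layout like this one.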

Note that this is separate from the XYZ/XYZM work, as that extends the dimensions of the Coordinate type, and this proof-of-concept does not modify Coordinate.

Examples

use std::collections::HashMap;

type GpxMetadata = HashMap<String, String>;

fn parse_gpx(s: &str) -> geo_types::Geometry<f64> {
    ...
}

fn parse_gpx_with_metadata(s: &str) -> geo_types::Geometry<f64, GpxMetadata> {
    ...
}

frewsxcv · Sep 05 '22 17:09

I think this is clever and is in pursuit of solving a real problem that I have. I can see how the approach you laid out could be useful in some cases, but I can also see how it would be confusing in others.

One example of where this could be confusing: When union'ing two polygons, which properties does the output retain?

In QGIS, there is a UI with default settings and discoverable ways to choose which properties are retained. I'm not sure what the equivalent would be for something like this in a pure code solution.

Today, when I want a geometry and associated data, I use composition. That is, I build up some custom analogue of a Feature (as in https://www.rfc-editor.org/rfc/rfc7946#section-3.2), where one field is the Geometry and the other fields are the strongly typed associated "properties" I care about.

It does mean that I experience some tedium as I plumb through the geometric components of spatial operations. For example, here's a spatial join of a City (metro area) that overlaps multiple States.

use geo::BooleanOps; // provides `intersection` on polygons

struct State {
    geometry: geo::Polygon,
    name: String,
    tax_rate: f64,
}

struct City {
    geometry: geo::Polygon,
    name: String,
    population: u64,
}

struct CityState {
    geometry: geo::MultiPolygon,
    // Names must distinguish between city and state
    city_name: String,
    state_name: String,

    tax_rate: f64,

    // We could keep a `population` field, but depending on our use case,
    // it might not make sense, since this struct's geometry represents only a 
    // slice (potentially an empty slice!) of the city.
    //
    // population: u64
}

impl City {
    // Adding geometric functionality to our "Features" feels a bit tedious...
    fn intersection(&self, state: &State) -> CityState {
        CityState {
            // `BooleanOps::intersection` takes its argument by reference.
            geometry: self.geometry.intersection(&state.geometry),
            // ... but ultimately what we name things and what fields we keep is often pretty arbitrary.
            city_name: self.name.clone(),
            state_name: state.name.clone(),
            tax_rate: state.tax_rate,
        }
    }
}

// this metro area spans 3 states
let sioux_city: City = sioux_city();

let nebraska: State = nebraska();
let iowa: State = iowa();
let south_dakota: State = south_dakota();

let sioux_city_iowa: CityState = sioux_city.intersection(&iowa);
let sioux_city_nebraska: CityState = sioux_city.intersection(&nebraska);
let sioux_city_south_dakota: CityState = sioux_city.intersection(&south_dakota);

So that's how I'd solve the problem today. But what would it look like with the associated data in your PR? Did you have something like this in mind?

struct StateProps {
    name: String,
    tax_rate: f64,
}

struct CityProps {
    name: String,
    population: u64,
}

// this metro area spans 3 states
let sioux_city: Polygon<f64, CityProps> = sioux_city();

let nebraska: Polygon<f64, StateProps> = nebraska();
let iowa: Polygon<f64, StateProps> = iowa();
let south_dakota: Polygon<f64, StateProps> = south_dakota();

// FIXME: What type goes in place of the ???
let sioux_city_iowa: Polygon<f64, ???> = sioux_city.intersection(&iowa);
let sioux_city_nebraska: Polygon<f64, ???> = sioux_city.intersection(&nebraska);
let sioux_city_south_dakota: Polygon<f64, ???> = sioux_city.intersection(&south_dakota);

Note that Polygon<f64, CityProps> and Polygon<f64, StateProps> are different types. Would we even be able to call intersection between them given that they are different types?

If I understand correctly, probably not. So one limitation of the proposal as it stands is that only types with the same "metadata" type could interact spatially. Is that your understanding as well?
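The type question can be sketched without any real geometry. The following is a minimal illustration that assumes a stand-in `Shape` type, with an `area` field in place of coordinates. The `SameTypeOps` trait, the `intersection_with` method, and the `CityStateProps` struct are all hypothetical; none of them is part of geo or of the PR. The first trait mirrors an `other: &Self` signature, which ties both operands to one `Ext` type. The second method shows one possible way around that: the caller passes a closure that decides which properties the output keeps.

```rust
// Hypothetical stand-in for a geometry carrying extra data. `area` replaces
// real coordinates so the sketch stays self-contained.
#[derive(Debug, Clone, PartialEq)]
struct Shape<Ext = ()> {
    area: f64,
    ext: Ext,
}

// A trait shaped like `fn intersection(&self, other: &Self)`: both operands
// must have exactly the same type, including the `Ext` parameter.
trait SameTypeOps {
    fn intersection(&self, other: &Self) -> Shape;
}

impl<Ext> SameTypeOps for Shape<Ext> {
    fn intersection(&self, other: &Self) -> Shape {
        // Placeholder "geometry": pretend the overlap is the smaller area.
        // The attached data is dropped because there is no rule for merging it.
        Shape { area: self.area.min(other.area), ext: () }
    }
}

impl<A> Shape<A> {
    // Hypothetical alternative: accept any other `Ext` type and let the
    // caller decide what the output carries.
    fn intersection_with<B, C>(
        &self,
        other: &Shape<B>,
        merge: impl FnOnce(&A, &B) -> C,
    ) -> Shape<C> {
        Shape {
            area: self.area.min(other.area),
            ext: merge(&self.ext, &other.ext),
        }
    }
}

#[allow(dead_code)]
#[derive(Debug)]
struct CityProps {
    name: String,
    population: u64,
}

#[derive(Debug)]
struct StateProps {
    name: String,
    tax_rate: f64,
}

#[derive(Debug, PartialEq)]
struct CityStateProps {
    city_name: String,
    state_name: String,
    tax_rate: f64,
}

fn city_in_state(city: &Shape<CityProps>, state: &Shape<StateProps>) -> Shape<CityStateProps> {
    // `city.intersection(state)` would not compile here: `&Self` is
    // `&Shape<CityProps>`, but `state` is a `&Shape<StateProps>`.
    city.intersection_with(state, |c, s| CityStateProps {
        city_name: c.name.clone(),
        state_name: s.name.clone(),
        tax_rate: s.tax_rate,
    })
}
```

The closure moves the "which properties survive" decision to each call site. That keeps strong typing, but it does not remove the boilerplate of naming the output fields.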

Am I right that this has come up in your work on rgis? I can imagine where a "loosely typed bag of state" is probably a lot of what you're dealing with when your input is whatever the user supplies. Even with that limitation, I could imagine it being useful for things like GPX and GeoJSON where all the associated properties are a HashMap of "whatever", but it seems like a pretty severe limitation for those preferring stronger typing of their feature data.

I'd love to see this kind of thing get easier. It's a major source of boilerplate for me, but, as phrased, this is a pretty substantial change, so I'd definitely want to see some more working examples.

michaelkirk · Sep 06 '22 19:09