geo
geo copied to clipboard
Implement 3D and Measure support for geo types
UPDATE: most of this pr has been moved to #797 (all of geo-types changes). 797 should be merged and published first. This PR will need to be updated once 797 is published. I will mark this as a draft in the mean time.
This PR changes the underlying geo-type data structures to support 3D data and measurement values (M and Z values). The PR attempts to cause relatively minor disruptions to the existing users (see breaking changes below). My knowledge of the actual geo algorithms is limited, so please ping me for any specific algo change.
Many other pending PRs are included here - most of them can be merged without affecting existing users, and will make reviewing this PR easier. See the list below.
geo-types restructuring
All geo type structs have been renamed from Foo<T>(...) to FooTZM<T,Z,M>(...), and several type aliases were added:
// old
pub struct LineString<T: CoordNum>(pub Vec<Coordinate<T>>);
// new
pub struct LineStringTZM<T: CoordNum, Z: ZCoord, M: Measure>(pub Vec<CoordTZM<T, Z, M>>);
pub type LineString<T> = LineStringTZM<T, NoValue, NoValue>;
pub type LineStringM<T, M> = LineStringTZM<T, NoValue, M>;
pub type LineStringZ<T> = LineStringTZM<T, T, NoValue>;
pub type LineStringZM<T, M> = LineStringTZM<T, T, M>;
NoValue magic
NoValue is an empty struct that behaves just like a number. It supports all math and comparison operations. This means that a Z or M value can be manipulated without checking if it is actually there. This code works for Z and M being either a number or a NoValue:
pub struct NoValue;
impl<T: CoordNum, Z: ZCoord, M: Measure> Sub for CoordTZM<T, Z, M> {
type Output = Self;
fn sub(self, rhs: Self) -> Self {
coord! {
x: self.x - rhs.x,
y: self.y - rhs.y,
z: self.z - rhs.z,
m: self.m - rhs.m,
}
}
}
Variant algorithm implementations
Function implementations can keep just the original 2D <T> variant, or add support for 3D <Z> and/or the Measure <M>. The above example works for all combinations of objects. This function only works for 2D objects with and without the Measure (Z is set to NoValue by LineM type alias.
impl<T: CoordNum, M: Measure> LineM<T, M> {
/// Calculate the slope (Δy/Δx).
pub fn slope(&self) -> T {
self.dy() / self.dx()
}
}
Breaking changes
Tuple constructors
It will not be possible to use implicit type constructor for tuples because type aliases do not support them. Most geo types will have a new(...) fn if they didn't have one before.
// old
MultiPoint(vec![...]);
...iter().map(MultiPoint);
// new
MultiPoint::new(vec![...]);
...iter().map(MultiPoint::new);
Type destructors
Destructuring assignments seem to require real type rather than an alias.
// old
let Point(c) = p;
// new
let PointTZM(c) = p;
Related PRs
This PR includes #778, #782, #783, #784, #786, #787, #788, #775, #772, #785 which can be easily merged without breaking existing code.
TODO
- [ ] implement some simple algorithm that works with 2d/3d/keeps measurement values
- [ ] implement some simple SRID example
Wow! It's astonishing that such a major extension is not more intrusive. For geo-types users, the major changes are probably that they have to use new for construction and x() and y() for access.
One point which should be prototyped probably, is how to implement 2D functions keeping additional dimensions and meta data.
Naming:
- I would prefer an other word for "Metadata"/"meta". Additional dimensions like "measurement" or "time" are not "meta" for me. There are things like an srid which I would consider as "meta", though.
- For
Point3Dor similar, I would prefer the same naming as PostGIS:PointZorPointZM
Yeah, this looks great.
But I wonder how it affects the IDE support? rust-analyzer knows almost nothing about generics.
But I wonder how it affects the IDE support? rust-analyzer knows almost nothing about generics.
I haven't had any issues with it in IntelliJ - seems to work well -- e.g. if there is a set_x() implemented just for the D = 2 generic const, it suggests set_x as an available option, whereas it hides it for D = 3. I'm not an expert in VS Code, and it seems the Rust plugin is a bit dated (last commit was in 2021), so can't really say.
Naming is hard. @pka I renamed Coordinate3D to CoordinateZ. What should we do about metadata? There are two types of info:
- data that affects how coordinates are handled, i.e. SRID
- user-supplied tag data that algorithms do not handle, but try to keep around, i.e. measurements/times/extra tags/...
SRID could (later?) be introduced as a separate generic field, with some other trait magic that implements a default. The tag data could be called something else: tag, extra, info, custom, ...?
Disclaimer that I've only given this a cursory look. There's a lot of noise in the diff from the upstream commits.
re: the naming of Coord that you've proposed - there is some previous discussion about using that word elsewhere. See https://github.com/georust/geo/issues/540.
As such, it would be good to reserve that nice short name for the place it's likely to be used the most.
A couple of things I don't understand yet:
-
Is the proposed
metadatafield actually required to implement the 3d/4d coordinates, or is that a separable change? -
Is your intention to add 3d/4d for all geometry types - e.g.
LineString3D,LineString4Dor only for the Coordinates?
re: the naming of
Coordthat you've proposed - there is some previous discussion about using that word elsewhere. See #540. As such, it would be good to reserve that nice short name for the place it's likely to be used the most.
Sure, any name is fine here - MultiCoord, CoordStore, GenericCoord, ... cast your votes :)
- Is the proposed
metadatafield actually required to implement the 3d/4d coordinates, or is that a separable change?
No, metadata (to be renamed per above) is not required for 3D, but seems to be requested enough (and common enough in the industry) to introduce at the same time. This reduces code churn because adding M generic is identical to adding D generic, so the same code spots will be changed twice.
- Is your intention to add 3d/4d for all geometry types - e.g.
LineString3D,LineString4Dor only for the Coordinates?
The intention is to add LineStringZ and the rest of them. (per PostGIS convention as also suggested above). I am not an expert in geo algorithms, so this is more of a basis for experimentations, to see how we can adapt existing algorithms to work on multidimensional features, and to possibly introduce 3D-specific ones. This also experiments with adding the extra data, to see which algos can keep that data around if possible.
No, metadata (to be renamed per above) is not required for 3D, but seems to be requested enough (and common enough in the industry) to introduce at the same time. This reduces code churn because adding M generic is identical to adding D generic, so the same code spots will be changed twice.
Thanks for your consideration of breaking changes. It is indeed nice to consolidate breakage.
Is it clear to you what the behavior of a simple algorithm like bounding_rect would do with respect to the metadata field? Is it just always default?
type MyCoord = Coord<f64, char, 2>;
let a = MyCoord::new(1.0, 2.0, 'a');
let b = MyCoord::new(2.0, 8.0, 'b');
let b = MyCoord::new(10.0, 9.0, 'c');
let ls = line_string![a, b, c];
let rect = ls.bounding_rect();
// Is this what you're planning?
assert_eq!(rect.meta(), Default::default());
It seems kind of at odds with it's utility. e.g. if I'm holding my SRID in meta, maybe I expect that to be preserved in the meta of the output geometry?
You don't have to respond right now, I'm sure you're still thinking about the design. I just hope to avoid a situation where, in the effort to accommodate two new things, it becomes too challenging and then we end up with neither. Spend your time as you like, of course.
@michaelkirk actually see https://github.com/georust/geo/pull/742#issuecomment-1048004626 -- i think SRID won't be part of the (what is now known as metadata, but shall be renamed). SRID affects algorithms. "meta" does not, so it is more like an extra/tag thing. And no, no idea of how it will be used -- need to look at how postgis deals with the measurement. One example is ST_AddMeasure. In that doc, note the important comment:
This function supports 3d and will not drop the z-index.
So in PostGIS every function capable of understanding z-index is marked as such.
Looking further at various M-related PostGIS functions like ST_FilterByM, they all work with M value that can be compared, extrapolated, etc. That implies that M should be used as where M: CoordNum, but for flexibility it should be separate from the x/y/z type T: CoordNum. SRID on the other hand seems to always be an integer, so we could probably introduce that using the same S: CoordNum construct. The reason for generic is because it allows () as a valid parameter type - effectively removing it from the storage
After a lot of experimentation, it seems the constant-generic-based array would be too difficult and pointless to implement. Instead, per a wonderfully quick stackoverflow suggestion, I'm introducing a new NoValue struct with no fields. Using it as a type for z and m values allows far fewer changes - must still use coord! { x, y } instead of Coordinate { x, y }, but no longer need x(), y() functional accessors. Serde support is back in. Now if only i could somehow enforce the type of z to always be either NoValue or the same as the type of x/y... Not certain how to do that yet.
P.S. SRID should be done as one value per feature (point/line/...), not one per coordinate, so removing it for now.
Update: stabilizing coordinates and points. Most algorithms will not work if the point/coord has z or m values because algorithms are accepting/returning Point<T> instead of PointTZM<...>. There is one test that shows multi-dimensional usage - contains(). Also note that PointTZM must now be used everywhere to create new instances, e.g. Point(c) is now invalid, and must be replaced with point!(c) or PointTZM(c). This is a trade-off to keep existing function and var declaration signatures like fn foo() -> Point<T>.
Alternatively we can keep the existing Point and extend its generic params: Point<T, Z, M>, plus create type aliases like Point2D<T>. This would allow constructor usage like Point(c), but all var declarations and function returns would have to be changed to either fn foo() -> Point<T,Z,M> or fn foo() -> Point2D<T>.
To simplify migration, some of these changes were moved to https://github.com/georust/geo/pull/775
/// Test that actually works but only if all types match
assert!(point!(x: 1, y: 2, z: 3, m: 4).contains(&point!(x: 1, y: 2, z: 3, m: 4)));
/// Coordinate storage
pub struct CoordTZM<T: CoordNum, Z: ZCoord, M: Measure> {
pub x: T,
pub y: T,
pub z: Z,
pub m: M,
}
pub type Coordinate<T> = CoordTZM<T, NoValue, NoValue>;
pub type CoordinateM<T, M> = CoordTZM<T, NoValue, M>;
pub type CoordinateZ<T> = CoordTZM<T, T, NoValue>;
pub type CoordinateZM<T, M> = CoordTZM<T, T, M>;
// points
pub struct PointTZM<T: CoordNum, Z: ZCoord, M: Measure>(pub CoordTZM<T, Z, M>);
pub type Point<T> = PointTZM<T, NoValue, NoValue>;
pub type PointM<T, M> = PointTZM<T, NoValue, M>;
pub type PointZ<T> = PointTZM<T, T, NoValue>;
pub type PointZM<T, M> = PointTZM<T, T, M>;
Update: TZM support has been implemented on all geo types, and many simple implementations for those types. Some open questions:
- [ ] How should coordinate math be handled? e.g.
c1 + c2doesc1.x+c2.x, c1.y+c2.y. What should it be when z and/or m are given? Same for divide/multiply/negate/subtract
This PR is ready for review. I updated the top comment with lots of details and examples. Reviewing should become much easier after related non-breaking PRs are merged (see list at the top)
This PR now removes all WKT code, and creates a wrapper for unit tests that converts from legacy to new geo-types (per @lnicola suggestion).
Sadly, bumping versions introduced a new unanticipated problem. geo / transform.rs re-exports code from proj crate, which uses geotypes. So this PR would not compile because geo uses new geo-types, but geo also wants to embed proj that uses old geo-types.
One option is to merge and publish changes to geo-types as a new breaking release, without upgrading geo code, but that won't pass CI because geo uses relative path to ../geo-types/ in several Cargo.toml and also overrides it in [patch.crates-io]. It seems we may have to move geo-types into a separate repo.
Do you have an example CI run with the proj-related failure? I don't know if it's as "easy" to fix as the wkt one, but the tests appear to be green.
Do you have an example CI run with the
proj-related failure? I don't know if it's as "easy" to fix as thewktone, but the tests appear to be green.
@lnicola try cargo test --features use-proj from the /geo dir:
error[E0599]: no method named `transform_crs_to_crs` found for struct `RectTZM` in the current scope
--> geo/src/algorithm/transform.rs:17:14
|
17 | .transform_crs_to_crs("EPSG:2230", "EPSG:26946")
| ^^^^^^^^^^^^^^^^^^^^ method not found in `RectTZM<f64, NoValue, NoValue>`
If I'm looking at it right, that test is trying to check something about the current geo-types by checking a geo-types value as re-exported by the proj crate. So it's not testing the current geo-types, but the one used by proj.
I'd be in favour of dropping the test.
Are all the changes in this pull request isolated to geo-types? If so, we can just temporarily drop the path = "../geo-types" part in geo's Cargo.toml, and instead just lock it to the published crates.io version. That should resolve this issue, and we shouldn't need to delete any tests or move any repos.
Doesn't that make things slightly harder when we need to make a breaking change? We have to publish geo-types first, update geo and any other crates, then publish them, too. It's workable, but a bit strange.
What I meant above is that the test is actually for a different version of the Point type (not the one in the current workspace), so it makes sense to check that it works where it's actually implemented.
If I'm looking at it right, that test is trying to check something about the current
geo-typesby checking ageo-typesvalue as re-exported by theprojcrate. So it's not testing the currentgeo-types, but the one used byproj.I'd be in favour of dropping the test.
I don't think we can drop it -- that test indicates a problem -- if your code uses geo crate, and uses projections from it (instead of going directly to the proj crate), your code will also not work.
I was thinking about limiting this PR to just the geo-types changes, but there are a lot of other places (various Crates.toml files) that either use a relative path to geo-types, or patch the [crates.io] with a relative path. We would have to modify all of them. Once the PR is merged and the new version of geo-types is published, we would have to revert all of them. It is a viable path if we want to take it
I don't think we can drop it -- that test indicates a problem -- if your code uses geo crate, and uses projections from it (instead of going directly to the proj crate), your code will also not work.
Yeah, that's true. Maybe the implementation of this should live in geo, then?
I was thinking about limiting this PR to just the geo-types changes, but there are a lot of other places (various
Crates.tomlfiles) that either use a relative path togeo-types
I guess I'm confused why we can't remove the path line in each dependent Cargo.toml file. This is what we've done in the past when we've had breaking changes and it worked fine.
I guess I'm confused why we can't remove the
pathline in each dependentCargo.tomlfile. This is what we've done in the past when we've had breaking changes and it worked fine.
@frewsxcv seems like it is not possible until the current geo-types is published. Current main code for geo and other modules already uses some of the unpublished functions.
Ugh yeah that is tricky since there's an unpublished geo-types breaking change, so we can't just release a new minor version. One option is we could release 0.8.0-beta.0 for geo-types right now, and then you could pin geo to that. Not the best, but it is an option.
sure, that would work. Could you do that?
@frewsxcv also, if it is a breaking change, can we also merge #772 before beta-publishing?
If releasing geo-types 0.7.4 solves this problem, that seems fine. It'd be good to have a release anyway, but I'm a little confused as to what the problem is. Typically these problems can be addressed by sufficiently munging [patch.crates-io].
So that I know what the end goal is... is it sufficient if I can get cargo build --features use-proj to work with this branch?
@michaelkirk yep, the goal is to pass CI (feel free to push directly to my branch)
@nyurik -
I don't see where you've patched the WKT dependency. Are you intending that it use https://github.com/nyurik/wkt/tree/wkt-dim?
I don't see where you've patched the WKT dependency. Are you intending that it use https://github.com/nyurik/wkt/tree/wkt-dim?
@michaelkirk I didn't need to patch WKT - there is a workaround (very last file in the PR) -- jts-test-runner/src/input.rs, which uses published wkt as is, and just converts the result.
P.S. So no, I don't intend to use that branch at all. Actually I think I will modify WKT at some point later to get rid of the internal geo types, and just use the main geo-types for everything