Discussions towards better stability of core pieces of geoarrow-rs
From https://github.com/geoarrow/geoarrow-rs/issues/1015#issuecomment-2776800398
I think there are a few problems with geoarrow-rs:
- there's a wide variety of code at different states of production-readiness
- Partly due to me trying to do too much, struggling a bit with how best to model GeoArrow in general, and also learning Rust through this whole process, there's a whole lot of code that is decidedly not production ready.
- Because it's all one Rust crate, geoarrow, there's no clear lines between what is (closer to) production-ready and what is not.
I think a way to break through this impasse is to select relatively small, well-defined subsets of GeoArrow functionality and break them into subcrates. For one, this forces more thought about public APIs, because across crates you can't access any pub(crate) items. It lets us more clearly document which subsets we expect to be more stable and tested. And external users like yourself can start to build on only those pieces without even bringing in the dependencies for the full geoarrow crate.
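To make the visibility point concrete, here is a minimal single-file sketch in which two modules stand in for code that could later become separate crates. The names `schema`, `array`, `Dimension`, and `from_size_unchecked` are hypothetical and are not taken from geoarrow-rs.

```rust
// Stand-in for a small, stable subcrate (hypothetical names throughout).
pub mod schema {
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum Dimension {
        Xy,
        Xyz,
    }

    impl Dimension {
        /// Public API: this stays reachable after a crate split.
        pub fn size(self) -> usize {
            match self {
                Dimension::Xy => 2,
                Dimension::Xyz => 3,
            }
        }

        /// Crate-internal shortcut: callable anywhere inside this crate,
        /// but invisible to every other crate.
        pub(crate) fn from_size_unchecked(n: usize) -> Dimension {
            if n == 3 {
                Dimension::Xyz
            } else {
                Dimension::Xy
            }
        }
    }
}

// Stand-in for code that would move into a different subcrate.
pub mod array {
    use super::schema::Dimension;

    /// Compiles only because `array` and `schema` share one crate.
    /// If `schema` became its own crate, this call would be rejected
    /// and `schema` would need a deliberate public constructor instead.
    pub fn infer_dimension(coord_len: usize) -> Dimension {
        Dimension::from_size_unchecked(coord_len)
    }
}
```

Once the two modules live in different crates, the compiler flags every spot where one side relied on the other's internals, which is the forcing function described above.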
On a spectrum from more stable to less stable:
- Core types conforming to the spec, like what is now in geoarrow-schema
- "primitive" Array layouts like Point/LineString etc
- "complex" Array layouts like Geometry and GeometryCollection
- Array builders
- Conversions between GeoArrow memory and geo, WKB, and WKT
- Reading/writing Parquet
- Reading/writing FlatGeobuf
- Chunked arrays (should maybe remove)
- Table concept (should probably remove)
- Conversions between GeoArrow memory and geos
- Geometry operations using geo
- Casting
- Geometry operations using geos
- Reading/writing other geo formats
- Reading/writing to PostGIS
Is there a well-defined subset of this project that you think you would use if it were more stable? Is there a piece that you're interested in that we could work on together to make stable?
Originally posted by @kylebarron in https://github.com/geoarrow/geoarrow-rs/issues/1015#issuecomment-2776800398
cc @paleolimbot
First, the whole geoarrow crate (and the ecosystem adoption it's largely behind) is awesome and any of my gripes should be taken with a grain of salt. The least useful thing I'll say is that all of these things are things that eventually should be enabled!
I think the absolutely essential bits are geoarrow-schema (mostly done!), iteration via geo-traits (pretty sure this is somewhere), and building from buffers + validation (substantially easier than an arbitrary builder, I think).
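As an illustration of what "build from buffers + validate" could look like, here is a std-only sketch of a line string array made of a flat interleaved XY coordinate buffer and an offsets buffer. `LineStringArray`, `ValidationError`, and `try_new` are hypothetical names for this sketch and not the geoarrow-rs API. It also assumes the first offset is zero, which is a simplification (sliced Arrow arrays need not satisfy it).

```rust
/// Hypothetical line string array: interleaved XY coordinates plus
/// offsets counted in coordinate pairs (not in f64 values).
#[derive(Debug)]
pub struct LineStringArray {
    coords: Vec<f64>,  // x0, y0, x1, y1, ...
    offsets: Vec<i32>, // geometry i spans offsets[i]..offsets[i + 1]
}

#[derive(Debug, PartialEq)]
pub enum ValidationError {
    OddCoordLength(usize),
    EmptyOffsets,
    FirstOffsetNotZero(i32),
    DecreasingOffsets { index: usize },
    LastOffsetMismatch { last: i32, num_coords: usize },
}

impl LineStringArray {
    /// Takes ownership of already-filled buffers and only checks invariants;
    /// no incremental builder state is involved.
    pub fn try_new(coords: Vec<f64>, offsets: Vec<i32>) -> Result<Self, ValidationError> {
        if coords.len() % 2 != 0 {
            return Err(ValidationError::OddCoordLength(coords.len()));
        }
        let num_coords = coords.len() / 2;

        let first = *offsets.first().ok_or(ValidationError::EmptyOffsets)?;
        if first != 0 {
            // Simplifying assumption of this sketch.
            return Err(ValidationError::FirstOffsetNotZero(first));
        }
        if let Some(i) = offsets.windows(2).position(|w| w[1] < w[0]) {
            return Err(ValidationError::DecreasingOffsets { index: i + 1 });
        }
        // Non-empty and non-negative here: first is 0 and offsets never decrease.
        let last = *offsets.last().unwrap();
        if last as usize != num_coords {
            return Err(ValidationError::LastOffsetMismatch { last, num_coords });
        }
        Ok(Self { coords, offsets })
    }

    /// Number of line strings in the array.
    pub fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    pub fn is_empty(&self) -> bool {
        self.len() == 0
    }

    /// Iterates over each line string as a slice of interleaved XY values.
    pub fn iter(&self) -> impl Iterator<Item = &[f64]> + '_ {
        self.offsets
            .windows(2)
            .map(move |w| &self.coords[2 * w[0] as usize..2 * w[1] as usize])
    }
}
```

For example, `LineStringArray::try_new(vec![0.0, 0.0, 1.0, 1.0, 2.0, 0.0, 5.0, 5.0, 6.0, 6.0], vec![0, 3, 5])` yields two line strings of three and two coordinates. A real implementation would presumably hand out items that implement the geo-traits traits rather than raw slices; the slice-based iterator here only keeps the sketch dependency-free.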
I'm definitely happy to contribute some of these pieces although I'm not exactly sure of the timeline. I'm always happy to review, though!
One thing that may be worth considering is building the pieces up (e.g., geoarrow-schema, geoarrow-array, etc.) without refactoring geoarrow as you go. That would allow breaking changes if they're needed to scale back the scope and perhaps be a bit more fun (but maybe it's not bad to refactor as we go!)
As described in https://github.com/geoarrow/geoarrow-rs/pull/1097, the old geoarrow crate is being refactored into a monorepo of smaller crates.
I think this issue can be closed now.