Import via buffer protocol
@H-Plus-Time I think this is a bit cleaner than what you have in #56
Much nicer - yeah, I was very wary of transmute, this looks like a better alternative :smile: .
With the GIL lock, that pretty much only applies at buffer protocol import time, yeah?
Vectorized search would be a nice thing to have (unless it's obviated by tree to tree intersection candidates), and I'd imagine GIL locking would be a pain there.
With the GIL lock, that pretty much only applies at buffer protocol import time, yeah?
In theory, I think you're supposed to hold the GIL for the entire lifetime of the slice. But IMO I think it would be ok to make clear in the documentation that the input must not be mutated.
Vectorized search would be a nice thing to have
You mean searching a tree for lots of boxes at once, not just a single box? We could provide a naive implementation that does the loop in Rust. I suppose that's a separate concept from tree to tree intersection.
In case you're looking for a place to contribute and/or it's something you want to use 😄 , I implemented tree to tree intersection myself (as in, not something I ported from flatbush.js) and it appears to have a bug that I haven't had time to look into. See https://github.com/kylebarron/geo-index/issues/42 and https://github.com/kylebarron/geo-index/pull/51. Though it might not be trivial to fix
I think I tried to generalize too soon and it became more work than I expected 😢 . Probably not worth trying to support non-float box types yet
Also I was thinking of trying to support both numpy and arrow input, but that seems a little hard. In general I'm trying to move towards arrow-first, but it seems wrong to not support arrow here.
It would probably be simplest to separate out the constructors for numpy and arrow. The consumer can deal with that. So we'd have
RTree.from_interleaved
RTree.from_separated
RTree.from_arrow
where the from_arrow version will look for a geoarrow box array.
I think I tried to generalize too soon and it became more work than I expected 😢 . Probably not worth trying to support non-float box types yet
Yup, that's midnight-brain, I know it well 😬
It would probably be simplest to separate out the constructors for numpy and arrow. The consumer can deal with that. So we'd have
RTree.from_interleaved RTree.from_separated RTree.from_arrowwhere the from_arrow version will look for a geoarrow box array.
Yeah, may as well - the python tendency toward methods that take basically anything (as in lists of dicts OR dicts of lists OR random type X) and flag out how it's interpreted... Is maybe not a great thing to perpetuate.
Often my midnight brain is very productive 😛
Yeah, may as well - the python tendency toward methods that take basically anything (as in lists of dicts OR dicts of lists OR random type X) and flag out how it's interpreted... Is maybe not a great thing to perpetuate.
I'm definitely not a fan of accepting any kind of input. Especially now that we have good type checking in Python, each method should take only one kind of input. I was mostly thinking that arrow and numpy arrays could be interchangeably accepted. But I think separating into from_arrow is fine
In https://github.com/kylebarron/arro3/pull/204 I figured out a way to have Python-backed Arrow Array objects. So we can interpret a numpy numeric array as a non-null Arrow array without any memory copies.
The core geo-index crate doesn't use Arrow, but it may be worth it to use pyo3-arrow in the python bindings, because then we can accept either numpy or arrow input for the tree
And in https://github.com/kylebarron/arro3/pull/208 I automatically handle multidimensional numpy arrays as input, converting them (zero-copy) to nested fixed size lists
Latest update removes a lot of code by reusing the PyArrowBuffer from pyo3_arrow::buffer. See https://github.com/kylebarron/arro3/pull/241 and https://github.com/kylebarron/arro3/pull/242.
Work still to do:
- Update to latest git version from https://github.com/kylebarron/arro3/pull/242 to fix the
PyArrowBufferconstructor. - Update
from_interleavedandfrom_separatedto usePyArrayinput.