Benoit Bovy
I've started writing a `DatasetNode` class (WIP): https://gist.github.com/benbovy/92e7c76220af1aaa4b3a0b65374e233a Currently, this is a minimal class that just implements an "immutable" tree of datasets (it only allows adding child nodes so that...
Yes I'm actually not very happy with the `.dataset` attribute for accessing the underlying dataset object. On the other hand, similarly to `h5py` and `netCDF4`, I find it nice to...
Another difficulty regarding multi-coordinate indexes: ideally options should be set per index, not per coordinate.
Or we could simply decide that `.sel()` should not accept arbitrary options and handle special cases, e.g., via accessors. It would actually make sense to have something like `.my_accessor.sel_k_neighbors()`. Not...
Or use `Indexer` objects to group labels + options? This is slightly different than what you suggest: ```python class Dataset: def sel( self, indexers: Mapping[Any, Any] | Indexer | Iterable[Indexer],...
What happens if you create `Dataset` objects fully in memory instead of loading data from files? Is there a significant slowdown when you increase the size of the Dataset dimensions?...
> it is now allowed to provide array-like labels. Hmm not sure if it's a good idea... I find `get_locs()` a bit confusing like in the example below where a...
Looks like passing a `pandas.MultiIndex` object as `dim` argument to `concat` was forgotten during the explicit indexes refactor. While this can be fixed (could be tricky), we should deprecate it:...
I don't think we can use a k-d tree with the haversine metric on lat/lon: https://news.ycombinator.com/item?id=9281998 The rest of the matrix is valid. A ball tree works with any metric that respects the triangle inequality.
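To make the point concrete, here is a minimal stdlib-only sketch (not taken from any library; the function names, sample points and the Earth radius constant are my own assumptions). It checks that haversine distance respects the triangle inequality on a few points, and shows why comparing raw lat/lon coordinate values axis by axis is misleading near the antimeridian.

```python
import math
from itertools import permutations

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine(p, q, radius=EARTH_RADIUS_KM):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot before sqrt/asin.
    return 2 * radius * math.asin(math.sqrt(min(1.0, a)))


def naive_degree_distance(p, q):
    """Euclidean distance on raw (lat, lon) degrees, ignoring the sphere."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def satisfies_triangle_inequality(points, dist, tol=1e-9):
    """Check d(a, c) <= d(a, b) + d(b, c) for every ordered triple of points."""
    return all(
        dist(a, c) <= dist(a, b) + dist(b, c) + tol
        for a, b, c in permutations(points, 3)
    )


# Hypothetical sample points (lat, lon), two of them straddling the antimeridian.
points = [(0.0, 179.0), (0.0, -179.0), (45.0, 10.0), (-30.0, 120.0), (89.0, 0.0)]

# The property a ball tree relies on holds for haversine on this sample.
triangle_ok = satisfies_triangle_inequality(points, haversine)

# These two points are only 2 degrees of longitude apart on the sphere,
# but their raw coordinates differ by 358 degrees.
near_km = haversine(points[0], points[1])
raw_deg = naive_degree_distance(points[0], points[1])
```

The check is only a spot test on a handful of points, not a proof, but it illustrates why a structure that only needs a valid metric can work here while splitting on the raw longitude axis cannot.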
Actually, there is already a detailed comparison + benchmarks in [Jake's blog post](https://jakevdp.github.io/blog/2013/04/29/benchmarking-nearest-neighbor-searches-in-python/). But it'd still be interesting to see it applied to NEMO/FESOM spatial data patterns and the metrics/conversions used...