datatree
datatree copied to clipboard
Feature Request: Some form of path auto-completion
The more I play with datatree, the more I realize the huge potential of this package. Amazing work @TomNicholas.
I have been using dt for CMIP6 analysis and one of the things that is still fairly slow for me is to navigate into different specific leaves of the tree. Having some sort of incremental auto-complete of the path like this:
dt.['model/<click tab> --> dt.['model/experiment_a/]
similar to what happens in the bash shell with tab completion would be an absolute killer feature. Most users already know how to navigate this (from their terminal). For massive trees this would speed up the workflow tremendously.
On the other hand I have no idea how complex this is to implement. Just wanted to suggest this feature.
I think that auto-completing strings like this is not possible in python at all. (I would be happy to be proven wrong though.)
What we can do instead is provide attribute-like access to child nodes, and have them autocomplete, in the same way that xarray already lets you autocomplete variable names. Then your example would work like this
dt.model.<click tab> --> dt.model.experiment_a
We can't do filesystem-like syntax like "../" like this though.
I already have a PR where I started doing this but it stalled because the implementation in Xarray I need to copy was a bit hard for me to understand. If anyone else wants to have a go before I get back to it then please do!
I think that auto-completing strings like this is not possible in python at all. (I would be happy to be proven wrong though.)
Pandas seems to have figured it out:
import pandas as pd
df = pd.DataFrame()
df['foo_bar'] = [0, 1, 2, 3]
df['foo_ <tab> # --> df['foo_bar
Not 100% sure how they do this and its possible that Ipython is doing a lot of the heavy lifting.
Oh wow that's cool! Will have to look at how they manage that wizardry
IPython looks for _ipython_key_completions_() for key based access.
Xarray already has some support: https://github.com/pydata/xarray/blob/6e77f5e8942206b3e0ab08c3621ade1499d8235b/xarray/core/coordinates.py#L342-L348
It's working!

(This isn't going to work for ../, not because I can't do it, but because there are any infinite number of ways to get to a given node if you allow ../ ...)
to support . and .. I believe you need some sort of path normalization.
Something like PosixPath.resolve but on NodePath, maybe? That converts to absolute, though, so we'd need to have a way to pass the current node's path? Or have one method that normalizes the path (but keep it relative) while the other uses a given path to convert to absolute?
@keewis apologies if I misunderstood, but I think you're misunderstanding what already works and what is being proposed as a feature here.
. and .. in path-like access via __getitem__ etc. already works, in main right now. Its in the docs here. It works in a different way from what you're suggesting, but maybe we could do it more neatly... NodePath does already have a .resolve method, which it inherits from pathlib.PurePosixPath.
What I'm talking about is . and .. in paths suggested by ipython key completion. The problem here is that _ipython_key_completion_ functions by expecting a list of all possible valid keys up front, then suggesting them all when it detects uncompleted string key access like user_class['. Requiring all possible paths up front means it can't specifically detect part of a string path (e.g. dt['some/relative/../path) and then change its auto-completion suggestions based on the string written out so far. All it is doing is narrowing down the options you might mean out of the finite set of valid keys you originally suggested.
So what I've done so far in #98 is include direct paths to subtree nodes in the _ipython_key_completions list, e.g. "path/to/node" and "path/to/node/below". But to support auto-completing with .. would require me to include all of {"path/to/../to/node", "path/to/../to/../to/node", "path/to/../to/../to/../to/node", ...}, which is an unlimited set.
Oh wow ipython's key completion code literally special-cases numpy and pandas explicitly :exploding_head:
What I'm talking about is . and .. in paths suggested by ipython key completion.
There is an issue on ipython that if solved would actually allow this feature fully.
I've merged #98 but will keep this open to track the nice-to-have feature of auto-completing relative paths as strings.
I think it would probably be reasonable to limit .. and . to the start of a path, which should limit the set size to the current node and it's subpaths, and it's ancestors.
Moved to https://github.com/pydata/xarray/issues/9440