Jérôme Dockès
About `skb.do`: I'm not strictly opposed, but it is starting to look a bit noisy and cryptic for someone who lands in the middle of an example and is not...
> About compute_ngram_distance, should we set it as private?

I'm in favor of making it private. It is too low-level and does not do enough to be useful for users...
> Here the first skb is the main namespace and the second is the dataops namespace and I think that would be very confusing to the users. I also prefer...
I don't think it helps users to give the module a meaningless name. Only we can guess that "do" stands for "data ops", and we don't usually give those kinds...
Regarding highlighting / directing attention, here are the ones I would de-prioritize:
- All the joiners: use data ops instead. The joiners are unusable in many cases because:
  - they...
That's great, @Neilblaze! I think it would be best to wait for #1233 to be merged before we start working on this one, because #1233 will influence what we...
Thanks @Vincent-Maladiere, those are both great ideas. For the first one, I would go for something that returns just a list of names, or that returns a mapping from name...
We could even have a `.skb.make_docstring(summary)` that collects all the variable names, descriptions and any relevant info and produces a text similar to the docstring you would write if the...
About the name for `get_var_names()`, `get_data_schema()`, etc.: whatever name we pick should be quite different from `get_params()`, to avoid confusing it with scikit-learn's `get_params()`.
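To make the idea concrete, here is a rough sketch of what "a mapping from name to description" plus a `make_docstring(summary)` helper could look like. It does not use skrub at all: `Var`, `Op`, `iter_nodes`, `get_var_descriptions` and `make_docstring` below are hypothetical toy stand-ins written only for illustration, and the real implementation would walk the actual expression graph instead.

```python
from dataclasses import dataclass, field


@dataclass
class Var:
    """Toy stand-in for a named input variable (NOT skrub's Var)."""
    name: str
    description: str = ""


@dataclass
class Op:
    """Toy stand-in for an operation applied to other nodes."""
    label: str
    inputs: list = field(default_factory=list)


def iter_nodes(node):
    """Yield every node reachable from `node`, inputs first, each one once."""
    seen = set()

    def visit(n):
        if id(n) in seen:
            return
        seen.add(id(n))
        for child in getattr(n, "inputs", []):
            yield from visit(child)
        yield n

    yield from visit(node)


def get_var_descriptions(node):
    """Hypothetical helper: map each variable name to its description."""
    return {n.name: n.description for n in iter_nodes(node) if isinstance(n, Var)}


def make_docstring(node, summary):
    """Hypothetical helper: build a numpydoc-like text from the variables."""
    lines = [summary, "", "Parameters", "----------"]
    for name, description in get_var_descriptions(node).items():
        lines.append(name)
        lines.append(f"    {description or 'No description provided.'}")
    return "\n".join(lines)


# Made-up example graph: `orders` is used twice but reported only once.
orders = Var("orders", "Table of orders, one row per order.")
products = Var("products", "Table of products.")
joined = Op("join", [orders, products])
pipeline = Op("predict", [joined, orders])
print(make_docstring(pipeline, "Predict order returns."))
```

Returning a plain dict keeps the first helper usable on its own (a list of names is just its keys), and the docstring helper is then only a formatting layer on top of it.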
> I'm working on the first idea

Cool :) You can get the nodes easily with `_evaluation.nodes`:

```
>>> import skrub
>>> from skrub._expressions._evaluation import nodes, Var
>>> e =...
```