Riccardo Cappuzzo comments

Results: 216 comments of Riccardo Cappuzzo

Clarifying objects and terminology: "DataOps make Learners"

I've updated the starting example in #1481 to try and implement some of the new terminology

Clarifying objects and terminology: "DataOps make Learners"

> Thanks for your super thorough and helpful work.
>
> I think that this is a great suggestion overall. I think that I would go for DataOps rather than...

Clarifying objects and terminology: "DataOps make Learners"

Adding more comments from discussing with others. Concerning DataOp vs DataOps: the name of a singular operation should be "Data Op", but as a collective name for the feature "DataOps"...

Clarifying objects and terminology: "DataOps make Learners"

Putting a pin in this: it's also possible to reuse the "pipeline"/"learner" simply for preprocessing, which might still be interesting for someone who wants to repeat the same transformations on...

Improve API page: inspiration from scikit-learn

With the addition of the expressions, the list of functions is very crowded and would definitely benefit from grouping related objects.

Improve UX of `.skb.subsample` when X and y are separate

> I know that it's possible to sample the first 1000 values, but given that there can also be random sampling we might want to add a seed to be...
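The point about seeding random sampling can be illustrated with a minimal sketch in plain Python. It does not use skrub, and `sample_indices` is a hypothetical helper introduced only for this example: fixing the seed makes the drawn row positions reproducible, and applying the same positions to both X and y keeps the two aligned.

```python
import random


def sample_indices(n_rows, n, seed):
    """Pick up to n distinct row positions reproducibly (hypothetical helper)."""
    rng = random.Random(seed)  # a local generator, so global state is untouched
    return sorted(rng.sample(range(n_rows), min(n, n_rows)))


# Toy data: each target value is derived from the first feature of its row.
X = [[i, i * 2] for i in range(10)]
y = [i % 2 for i in range(10)]

# Draw the positions once, then apply them to both X and y.
idx = sample_indices(len(X), n=4, seed=0)
X_sub = [X[i] for i in idx]
y_sub = [y[i] for i in idx]
```

Drawing the positions once and reusing them is the design choice that matters here: sampling X and y independently, even with random sampling enabled on both, would not guarantee matching rows unless both draws used the same seed and the same number of rows.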

Improve UX of `.skb.subsample` when X and y are separate

I'll work on the docstring

Improve UX of `.skb.subsample` when X and y are separate

Talking with @Vincent-Maladiere, we thought of improving the error message by detecting when one of X and y is sampled, but the other isn't. This should be done in `.skb.apply`...
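As a rough illustration of the kind of check discussed here, the sketch below compares the number of rows in X and y and raises a more descriptive error when they differ. It is plain Python, not skrub's implementation; `check_consistent_sampling` is a hypothetical name, and a length comparison only catches the simplest case of a mismatch.

```python
def check_consistent_sampling(X, y):
    """Raise a descriptive error when X and y have different numbers of rows.

    Hypothetical helper for illustration only; it is not part of skrub.
    """
    n_X, n_y = len(X), len(y)
    if n_X != n_y:
        raise ValueError(
            f"X has {n_X} rows but y has {n_y} rows. "
            "If one of them was subsampled, subsample the other the same way, "
            "or subsample before splitting the data into X and y."
        )


X_full = list(range(100))
y_full = list(range(100))
X_small = X_full[:10]  # only X is subsampled, y is left untouched

try:
    check_consistent_sampling(X_small, y_full)
except ValueError as err:
    message = str(err)  # the text a user would see
```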

Improve UX of `.skb.subsample` when X and y are separate

> To improve the documentation, I think there may be a misunderstanding that should be addressed early on in the introduction of `subsample`. Why did you expect that taking a...

Improve UX of `.skb.subsample` when X and y are separate

> > detecting when one of X and y is sampled, but the other isn't
>
> I'm not sure this would be easy to detect reliably, as subsampling can...