WIP add the Recipe

Open jeromedockes opened this issue 1 year ago • 4 comments

This is still in draft mode but I'll open the PR so we can discuss the example.

I still need to add more tests and reference documentation

jeromedockes avatar Sep 09 '24 13:09 jeromedockes

the example is example 10, "using the recipe"

jeromedockes avatar Sep 09 '24 14:09 jeromedockes

Hey @jeromedockes, could you write a small TL;DR regarding the recent changes?

Vincent-Maladiere avatar Oct 08 '24 09:10 Vincent-Maladiere

yes:

  • a small change to be compatible with the current version of the tablereport (columns that match a filter must now be given by their indices, not their column names)
  • removing get_x_test etc as we discussed in the first round of review
  • adding (developer) documentation to the _tuning module

jeromedockes avatar Oct 08 '24 09:10 jeromedockes

Great, thanks!

Vincent-Maladiere avatar Oct 08 '24 10:10 Vincent-Maladiere

Hello all 👋

Love the package, and what you are doing here! If you don’t know me, I’m one of the developers of tidymodels.

I’m here to ask if you are set on the name recipe for this class. I see you have referred to this class by other names in other issues. The reason I ask is that we maintain an R package called recipes, which produces recipe() objects as well. As far as I can tell, the two appear to overlap in scope: namely, a way to sequence a list of transformers for feature engineering/preprocessing, with selectors such as all_numeric() and the like. You can correct me if I’m wrong.

If they do overlap, I like to think it would be in our best interest to have disjoint names, in part to improve search results online.

Best! Emil Hvitfeldt

EmilHvitfeldt avatar Nov 20 '24 17:11 EmilHvitfeldt

Dear Emil,

The name "recipe" is not cast in stone. We are experimenting with APIs and names to make the resulting code and documentation as easy as possible to read and understand.

Note that the terminology "recipe" is also used in other projects, for instance https://ibis-project.github.io/ibis-ml/

There are only a limited number of commonly understood words within a given scope, so there is bound to be some overlap in terminology between packages. For instance, the term "data frame" is used across many packages. This can arguably be a good thing, as it helps users see the links between concepts.

Anyhow, we are still experimenting a lot with the concepts here (unfortunately not everything is visible online, forgive us), so I cannot really tell where we are going to go at this point.

Best

GaelVaroquaux avatar Nov 21 '24 08:11 GaelVaroquaux

Should we close this PR @jeromedockes?

Vincent-Maladiere avatar Feb 17 '25 08:02 Vincent-Maladiere

yeah I think so; superseded by #1233

jeromedockes avatar Feb 18 '25 09:02 jeromedockes