
Backward compatibility policy

Open · albertcthomas opened this issue 4 years ago · 7 comments

Regarding #236 but also more generally:

  1. What's the policy regarding backward compatibility with the ramp-kits? Should any change be compatible with the kits in ramp-kits (or should any change made in rampwf also update the ramp-kits so that they keep working with the suggested change)?

Now that ramp-workflow is on PyPI, would it be possible to require the kits in ramp-kits to pin a specific version of ramp-workflow and of the other dependencies (see the sketch after this list)? The kits that are not in the ramp-workflow repo are difficult to maintain; the ones in tests/kits/ are easy to maintain as part of the tests.

  2. What's the difference between the kits in tests/kits/ and the ones in ramp-kits but not in tests/kits/?
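For instance, a kit could ship a pinned requirements file along these lines (all version numbers here are purely illustrative):

```
ramp-workflow==0.2.1
scikit-learn==0.23.1
pandas==1.0.5
```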

albertcthomas · Jun 24 '20

The RAMP server has only one version of the ramp-workflow package. Keeping the workflow in sync with ramp-board is already a pain. Pinning versions seems hard when you have a single server, unless you force all kits to pass with one unique version; and if each kit can pin its own versions, you lose the unifying ramp-kit structure, as every kit can be unique.

If you ask me, I would integrate ramp-workflow into ramp-board and simplify, simplify, simplify, moving closer to the scikit-learn API for as many things as possible (especially the scorers).

agramfort · Jun 24 '20

@agramfort: what would you like to have for the scorers? If the goal is to be able to use sklearn metrics directly, it would be relatively easy to have a generic score_type factory that receives an sklearn scorer as input (when initialized in problem.py) and wraps it into a RAMP scorer. You would get the best of both worlds.

I don't think we could completely scrap the functionalities we added (e.g. the precision used for display, allowing lower-is-better scorers), plus it's nice to keep the possibility of recoding a score_function that receives Prediction objects when we have complex predictions and scorers.
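For illustration, a minimal sketch of what such a factory could look like. The class name SklearnScoreType and its signature are hypothetical; the attribute set (name, precision, is_lower_the_better) and the __call__(y_true, y_pred) interface are assumed from rampwf score-type conventions.

```python
# Hypothetical sketch, not the rampwf implementation.


class SklearnScoreType:
    """Wrap a plain sklearn metric function into a RAMP-style score type."""

    def __init__(self, score_func, name, greater_is_better=True,
                 precision=3):
        self.score_func = score_func  # e.g. sklearn.metrics.mean_absolute_error
        self.name = name
        self.precision = precision  # digits used when displaying the score
        self.is_lower_the_better = not greater_is_better

    def __call__(self, y_true, y_pred):
        return self.score_func(y_true, y_pred)


# In problem.py one could then write, e.g.:
# from sklearn.metrics import mean_absolute_error
# score_types = [
#     SklearnScoreType(mean_absolute_error, name='mae',
#                      greater_is_better=False),
# ]
```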

kegl · Jun 27 '20

Indeed, it would work. My concern was having to learn a new API for scoring, as sklearn can be seen as a standard. Relying on it would be a simple way to guarantee that workflow and board can evolve in different code trees without a risk of incompatibility.

my 0.5c

agramfort · Jun 29 '20

@agramfort is there an automatic way to determine in sklearn what input a given scorer requires? E.g. raw y_pred like RMSE, or class indices like accuracy (computed from the y_proba returned by predict). If not, we'll need two or three different wrappers that the user would need to choose from. Any other suggestions on how to deal with this?

kegl · Jul 14 '20

have a look at https://scikit-learn.org/stable/modules/generated/sklearn.metrics.make_scorer.html

You have the greater_is_better and needs_proba parameters.
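For example, a minimal sketch (the metric choices are arbitrary):

```python
from sklearn.metrics import log_loss, make_scorer, mean_squared_error

# Lower-is-better metric computed on raw predictions (predict):
mse_scorer = make_scorer(mean_squared_error, greater_is_better=False)

# Metric that needs probability estimates (predict_proba). Note that
# recent scikit-learn versions replace needs_proba with
# response_method='predict_proba'.
nll_scorer = make_scorer(log_loss, greater_is_better=False,
                         needs_proba=True)
```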

agramfort · Jul 14 '20

OK, I see. This is something that we would do in RAMP too, to wrap sklearn scorers into a RAMP scorer. But it seems that the "user" needs to provide the information about the sklearn score (e.g. what input it requires); it cannot be determined automatically, right? I mean: there is no catalogue (dict) in sklearn from which the greater_is_better and needs_proba parameters could be read out, right?

kegl · Jul 20 '20

It's the responsibility of the scorer to call the right predict function in sklearn. A scorer takes estimator, X, y and does the right thing internally.
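A short sketch of that calling convention, using scikit-learn's built-in scorer registry:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import get_scorer

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)

# A scorer is called as scorer(estimator, X, y); whether it uses
# predict or predict_proba is decided inside the scorer itself.
print(get_scorer('accuracy')(clf, X, y))      # calls predict
print(get_scorer('neg_log_loss')(clf, X, y))  # calls predict_proba
```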

agramfort · Jul 21 '20