ramp-workflow
Backward compatibility policy
Regarding #236 but also more generally:
- What's the policy regarding backward compatibility with the ramp-kits? Should any change be compatible with the kits in ramp-kits (or should any change made in rampwf also modify the ramp-kits so that they keep working with the suggested change)?
- Now that ramp-workflow is on PyPI, would it be possible to require the kits in ramp-kits to use a specific version of ramp-workflow and other dependencies? The kits not in the ramp-workflow repo are difficult to maintain; the ones in tests/kits/ are easy to maintain as part of the tests.
- What's the difference between the kits in tests/kits/ and the ones in ramp-kits but not in tests/kits/?
The ramp server has only one version of the ramp-workflow package. Having the workflow out of sync with ramp-board is already a pain. Pinning versions seems hard when you have one server, unless you force all kits to pass with one unique version; otherwise you lose the unifying ramp-kit structure, as every kit can be unique.
If you ask me, I would integrate ramp-workflow into ramp-board and simplify, simplify, simplify, moving closer to the scikit-learn API for as many things as possible (especially the scorers).
@agramfort : what would you like to have for the scorers? If the goal is to be able to use sklearn metrics directly, it would be relatively easy to have a generic score_type factory that receives an sklearn scorer as input (when initialized in problem.py), and wraps it into a ramp scorer. You would get the best of both worlds.
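A minimal sketch of such a factory, assuming a hypothetical `SklearnScoreType` adapter class (its name, attributes, and interface are illustrative, not the actual ramp-workflow API):

```python
from sklearn.metrics import mean_squared_error


class SklearnScoreType:
    """Adapt a plain sklearn metric function to a RAMP-like score type.

    Hypothetical adapter for illustration: the attribute names
    (name, is_lower_the_better, precision) mirror what a RAMP
    scorer exposes for display and leaderboard sorting.
    """

    def __init__(self, metric, name, is_lower_the_better=True, precision=3):
        self.metric = metric
        self.name = name
        self.is_lower_the_better = is_lower_the_better
        self.precision = precision

    def __call__(self, y_true, y_pred):
        # Delegate scoring to the wrapped sklearn metric; display code
        # would use self.precision and self.is_lower_the_better.
        return self.metric(y_true, y_pred)


# Usage in problem.py: wrap RMSE from sklearn's mean_squared_error.
rmse = SklearnScoreType(
    lambda y, p: mean_squared_error(y, p) ** 0.5, name='rmse')
```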
I don't think we could completely scrap the functionalities we added (e.g. precision for display, allowing lower-the-better scorers), plus it's nice to keep the possibility of recoding a score_function that receives Prediction objects when we have complex predictions and scorers.
Indeed, it would work. My concern was having to learn a new API for scoring, as sklearn can be seen as a standard; that would lead to a simple way to guarantee that workflow and board can continue in different code trees without a risk of incompatibility.
my 0.5c
@agramfort is there an automatic way in sklearn to determine what input a given scorer requires? E.g. raw y_pred like RMSE, or class indices like accuracy (computed from y_proba, as returned by predict). If not, we'll need two or three different wrappers that the user would need to choose from. Any other suggestion for how to deal with this?
have a look at https://scikit-learn.org/stable/modules/generated/sklearn.metrics.make_scorer.html
you have greater_is_better and needs_proba parameters
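A short sketch of how `make_scorer` carries this information (using `greater_is_better` with a regression metric to stay version-agnostic; `needs_proba`, renamed `response_method` in newer sklearn releases, likewise tells the scorer to call `predict_proba` instead of `predict`):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import make_scorer, mean_squared_error

# make_scorer bundles the metric with its orientation: with
# greater_is_better=False the returned scorer negates the metric so
# that "higher score is better" holds uniformly in model selection.
neg_mse = make_scorer(mean_squared_error, greater_is_better=False)

X, y = make_regression(random_state=0)
reg = LinearRegression().fit(X, y)
score = neg_mse(reg, X, y)  # sign flipped: 0 would be a perfect fit
```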
OK, I see. This is something that we would do in RAMP, too, to wrap sklearn scorers into a RAMP scorer. But it seems that the "user" needs to provide the information about the sklearn metric (e.g. what input it requires); it cannot be determined automatically, right? I mean: there is no catalogue (dict) in sklearn where the greater_is_better and needs_proba parameters can be read out, right?
It's the responsibility of the scorer to call the right predict function in sklearn. A scorer takes (estimator, X, y) and does the right thing internally.
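To illustrate the point above: two built-in scorers, called with the same `(estimator, X, y)` signature, internally dispatch to different prediction methods of the estimator.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import get_scorer

X, y = make_classification(random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X, y)

# An sklearn scorer is called with (estimator, X, y): it decides
# internally whether to call predict, predict_proba, or
# decision_function on the estimator.
acc = get_scorer('accuracy')(clf, X, y)       # uses clf.predict
nll = get_scorer('neg_log_loss')(clf, X, y)   # uses clf.predict_proba
```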