ramp-workflow
Collecting feature requests around a developmental feature for RAMP
When RAMP is used for developing models for a problem, we may want to tag certain versions of a submission, and even problem.py, together with the scores. One idea is to use git tags. For example, after running ramp-test ... --save-output, one could run another script that git-adds problem.py, the submission files, and the scores in training_output/fold_<i>, then commits and tags with a user-defined tag (plus maybe a prefix indicating that it is a scoring tag, so later we may automatically search for all such tags).
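A minimal sketch of what such a tagging script could look like, assuming the scores saved by --save-output end up under the submission folder and that a score/ tag prefix is the convention (the script name, prefix and paths are all illustrative, not part of ramp-workflow):

# tag_submission.py -- hypothetical helper, run after `ramp-test ... --save-output`
import subprocess
import sys


def tag_scored_submission(submission, tag):
    """git-add problem.py, the submission files and the saved scores, then
    commit and tag with a score/ prefix so all such tags can be listed later
    with `git tag --list 'score/*'`."""
    subprocess.run(
        ["git", "add", "-f", "problem.py", f"submissions/{submission}"],
        check=True)
    subprocess.run(
        ["git", "commit", "-m", f"scores for {submission} ({tag})"], check=True)
    subprocess.run(["git", "tag", f"score/{tag}"], check=True)


if __name__ == "__main__":
    tag_scored_submission(submission=sys.argv[1], tag=sys.argv[2])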
- When loading the data in RAMP, it seems the training data is read twice. When the data is big, this is a bit slow.
- Is it possible to parallelize the CV process?
Adding one feature that would be useful, at least to me: it would be great to have the ability to import more code from elsewhere in a submission, allowing multiple submissions to share some code. For now it can be done by creating a library and importing it, which is a bit tedious. @albertcthomas mentioned this could perhaps be done in a similar way to what pytest does: they have a conftest.py file for code that you want to reuse in different test modules. #181
@albertcthomas mentioned this could perhaps be done in a similar way to what pytest does: they have a conftest.py file for code that you want to reuse in different test modules.
Well, it is more like "this makes me think of conftest.py, which can be used to share fixtures", but I don't know what happens when you run pytest and I am not sure the comparison goes very far :). As written in the pytest doc: "The next example puts the fixture function into a separate conftest.py file so that tests from multiple test modules in the directory can access the fixture function".
This feature is discussed in issue #181.
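In the meantime, here is a sketch of the "create a library and import it" workaround mentioned above, assuming a shared common.py at the kit root and a scikit-learn-style submission exposing get_estimator() (both names are conventions chosen for the example, not requirements of RAMP):

# submissions/my_submission/estimator.py -- hypothetical submission file
import os
import sys

# Make the kit root importable so that several submissions can share common.py.
# This is the tedious part that a conftest.py-like mechanism (#181) would remove.
KIT_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", ".."))
if KIT_ROOT not in sys.path:
    sys.path.insert(0, KIT_ROOT)

from common import build_preprocessor  # shared code reused by several submissions

from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def get_estimator():
    # the shared preprocessing is combined with a submission-specific model
    return make_pipeline(build_preprocessor(), LogisticRegression())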
1. I find that the step of reading the data takes too much time: slower than reading it without RAMP.
2. It would be great if the mean result were also saved along with the bagged one.
3. Propose a LaTeX syntax for the results.
4. When the output is saved, it would be better to also save the experiment conditions (data label, tested hyperparameters, etc.) and keep everything somewhere, either locally or in the cloud, to check later.
Here are some features that could help:
- Model selection: "early killing" (e.g. successive halving or even simpler schemes), which implies sharing information during hyperopt, or at least a way to compare the current model to the best one so far (either a global Python variable or saving it somehow on disk...)
- Experimental protocol: having a parametrized problem.py; I'm keen on JSON (which could also be saved each time you launch ramp-test). I'm not a big fan of using commit tags since I can launch 10 different batches of experiments on different servers without wanting to commit each time just for an experiment's configuration file.
- Logging:
- Model saving and loading (path, hyperopt, ...)
- Possibility to rename the output score folders. E.g. depending on the task and the models I've implemented, I'd rather save the results with a different directory hierarchy, say w.r.t. hyperparameters or more global options. It helps regex search (useful with tensorboard for example), or plotting when dealing with tens of thousands of run experiments (and looking at parameter sensitivity).
- Other:
- Being able to modify the submissions while some experiments are running (it looks like the .py submission file is loaded several times; I am in the habit of loading the class somewhere, which allows me to do whatever I want while my experiments are running)
- Same as Gabriel, ease the imports in a submission; maybe I didn't find the right way to do so, but there is a lot of duplicated code in my submissions even though I've implemented a Pytorch2RAMP class. #181
From my (little) experience with RAMP, what made people a bit reluctant to use it was that it was too high level, meaning that we don't see the classical sequential process we are used to seeing in an ML script (load data, instantiate model, train it, test it). As an example, Keras (not the same purpose as RAMP) embedded some parts of the script to minimize the main script but kept the overall spirit of the classical script, making it as understandable as the original one. Using ramp-test on the command line may make RAMP more obscure to new users. Maybe having a small script (like the one already in the documentation, for example) giving the user a more pythonic way to play with it, without having to use ramp-test as a command line, could make machine learners more willing to use it.
I have heard this many times too, debugging is a pain, etc. To fix this, for now I stick to RAMP kits where you need to return a sklearn estimator that implements fit and predict, so you can replace ramp-test by sklearn's cross_val_score and just use your favorite environment to inspect / debug / run (vscode, notebook, google colab, etc.).
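A sketch of that scikit-learn-only loop, assuming the kit's problem.py exposes get_train_data() and the submission file defines a get_estimator() returning a scikit-learn estimator (the exact file and function names vary between kits):

import importlib.util

from sklearn.model_selection import cross_val_score

import problem  # the kit's problem.py, with the working directory at the kit root

# load the data the same way ramp-test would
X_train, y_train = problem.get_train_data()

# import the submission file directly, without going through ramp-test
spec = importlib.util.spec_from_file_location(
    "estimator", "submissions/starting_kit/estimator.py")
estimator = importlib.util.module_from_spec(spec)
spec.loader.exec_module(estimator)

clf = estimator.get_estimator()
# from here any tool works: notebook, vscode debugger, %debug, etc.
scores = cross_val_score(clf, X_train, y_train, cv=5)
print(scores.mean(), scores.std())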
Calling ramp-test from a notebook is as simple as
from rampwf.utils import assert_submission
assert_submission(submission='starting_kit')
This page https://paris-saclay-cds.github.io/ramp-docs/ramp-workflow/advanced/scoring.html now contains two code snippets that you can use to call lower-level elements of the workflow and emulate a simple train/test and cross-validation loop. @LudoHackathon do you have a suggestion what else would be useful? E.g. an example notebook in the library?
the doc says:
trained_workflow = problem.workflow.train_submission('submissions/starting_kit', X_train, y_train)
after all these years I did not know this :'(
this should be explained in the kits to save some pain to students
this should be explained in the kits to save some pain to students
wasn't this the purpose of the "Working in the notebook" section of the old titanic notebook starting kit?
Yes, @albertcthomas is right, but the snippet in the doc is cleaner now. I'm doing this decomposition in every kit now; see for example line 36 here: https://github.com/ramp-kits/optical_network_modelling/blob/master/optical_network_modelling_starting_kit.ipynb. That snippet is even simpler than the one in the doc but less general: it only works when the Predictions class does nothing with the input numpy array, which is most of the time (regression and classification). Feel free to reuse.
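For completeness, here is roughly what the full decomposition from the scoring page looks like when written out; the Predictions/score_types handling below is my reading of the API and should be checked against the doc snippets (as above, it assumes the Predictions class does nothing special with the input arrays):

import problem  # the kit's problem.py, with the working directory at the kit root

X_train, y_train = problem.get_train_data()
X_test, y_test = problem.get_test_data()

# train the submission through the kit's workflow object
trained_workflow = problem.workflow.train_submission(
    'submissions/starting_kit', X_train, y_train)

# predict on the test set and wrap ground truth and predictions
y_pred = problem.workflow.test_submission(trained_workflow, X_test)
predictions = problem.Predictions(y_pred=y_pred)
ground_truth = problem.Predictions(y_true=y_test)

# score with each metric declared in problem.py
for score_type in problem.score_types:
    print(score_type.name, score_type.score_function(ground_truth, predictions))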
This page https://paris-saclay-cds.github.io/ramp-docs/ramp-workflow/advanced/scoring.html now contains two code snippets that you can use to call lower-level elements of the workflow and emulate a simple train/test and cross-validation loop. @LudoHackathon do you have a suggestion what else would be useful? E.g. an example notebook in the library?
The page does a good job of showing how you can call the different elements (and thus play with them, make plots, ...).
- For better visibility we might clearly say that there is a command-line interface based on ramp-test and a way of calling the needed functions easily in a Python script (or notebook). Of course we could add an example showing the Python script interface.
- More importantly, maybe think of what can break when you go from one interface to the other, for instance imports from other modules located in the current working directory. This still forces us/the students to work with submission files. I think that using the "scikit-learn kits" eases the transfer of your scikit-learn estimator from your playground Python script/notebook to a submission file and makes sure that this works in most cases. I let @agramfort confirm this :)
- Instead of
from rampwf.utils import assert_submission
assert_submission(submission='starting_kit')
we could have something like
from rampwf import ramp_test
ramp_test(submission='starting_kit')
Debugging is a pain etc.
For debugging with the command line, I have to say that I rely a lot on adding a breakpoint where I want to enter the debugger. However, this cannot be done post-mortem, compared to using %debug in ipython or jupyter. For this we could have a --pdb or --trace flag as in pytest. But it's true that it's easier to try things and play with your models/pipelines when not using the command line.
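Until such a flag exists, post-mortem debugging can be emulated from Python by wrapping the call, for example as in the following sketch (in a notebook, running %debug in the cell after the failure achieves the same thing):

import pdb
import sys

from rampwf.utils import assert_submission

try:
    assert_submission(submission='starting_kit')
except Exception:
    # drop into the debugger at the frame where the submission failed
    pdb.post_mortem(sys.exc_info()[2])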
use your favorite env to inspect / debug / run (vscode, notebook, google colab etc.)
giving the user a more pythonic way to play with it, without having to use ramp-test as a command line
This is an important point. 2 or 3 years ago I was rarely using the command line and I always preferred staying in a Python environment. Users should be able to use their favorite tool to play with their models, and we should make sure that at the end it will work when calling ramp-test on the command line.
- OK
- no comment
- OK. In fact we may put the Python call in focus and tell them to use the command line ramp-test as a final unit test, the same way one would use pytest. I think the cleanest way would be to have ramp_test defined in https://github.com/paris-saclay-cds/ramp-workflow/blob/advanced/rampwf/utils/cli/testing.py and main would just call ramp_test with the exact same signature. In this way it's certain that the two calls do the same thing.
- I prefer not adding the command line feature if everything can be done from the Python call.
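A sketch of that layout, assuming a click-based CLI like the one ramp-test already uses (the option list is trimmed and the module split is only indicative, not the actual rampwf code):

# sketch of the Python entry point
from rampwf.utils import assert_submission


def ramp_test(ramp_kit_dir='.', ramp_data_dir='.', submission='starting_kit'):
    """Run the same train/validate/test loop as the ramp-test command."""
    assert_submission(ramp_kit_dir=ramp_kit_dir, ramp_data_dir=ramp_data_dir,
                      submission=submission)


# sketch of the command-line wrapper: main only forwards, with the same signature
import click


@click.command(name='ramp-test')
@click.option('--ramp-kit-dir', default='.')
@click.option('--ramp-data-dir', default='.')
@click.option('--submission', default='starting_kit')
def main(ramp_kit_dir, ramp_data_dir, submission):
    # identical signature, so the CLI and the Python call cannot diverge
    ramp_test(ramp_kit_dir=ramp_kit_dir, ramp_data_dir=ramp_data_dir,
              submission=submission)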
3. I prefer not adding the command line feature if everything can be done from the Python call.
Is this for 4. and --pdb?
doing:
import imp
feature_extractor = imp.load_source(
    '', 'submissions/starting_kit/feature_extractor.py')
fe = feature_extractor.FeatureExtractor()
classifier = imp.load_source(
    '', 'submissions/starting_kit/classifier.py')
clf = classifier.Classifier()
is to me too complex and should be avoided. We have a way suggested by @kegl based on the rampwf function.
Now I agree with @albertcthomas: leaving the notebook to edit Python files is a bit error prone.
What I have shown to students is to use the %%file magic to write a cell to a file on disk (see the example below).
Anyway, I think we should show in each notebook what the easy way is.
The ramp-test command is an easy way for us to know that it works on their systems, but it is not the most agile way for them when they need to come up with their own solution.
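For example, in the titanic kit a notebook cell like the following writes the classifier straight to the submission file, so the notebook stays the only place where the code is edited (class layout as in that kit's starting submission, reproduced here from memory):

%%file submissions/starting_kit/classifier.py
from sklearn.base import BaseEstimator
from sklearn.ensemble import RandomForestClassifier


class Classifier(BaseEstimator):
    def __init__(self):
        self.clf = RandomForestClassifier(n_estimators=10)

    def fit(self, X, y):
        self.clf.fit(X, y)

    def predict_proba(self, X):
        return self.clf.predict_proba(X)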
import imp
feature_extractor = imp.load_source(
    '', 'submissions/starting_kit/feature_extractor.py')
fe = feature_extractor.FeatureExtractor()
classifier = imp.load_source(
    '', 'submissions/starting_kit/classifier.py')
clf = classifier.Classifier()
is to me too complex and should be avoided. We have a way suggested by @kegl based on the rampwf function.
I'm not sure what you mean here. We're using import_module_from_source now.
I copied these lines from the titanic starting kit, which is used to get students started on RAMP.
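For reference, the same two imports written with import_module_from_source would look roughly like this (module path and argument order as I understand the current rampwf code, worth double-checking):

from rampwf.utils.importing import import_module_from_source

feature_extractor = import_module_from_source(
    'submissions/starting_kit/feature_extractor.py', 'feature_extractor')
fe = feature_extractor.FeatureExtractor()

classifier = import_module_from_source(
    'submissions/starting_kit/classifier.py', 'classifier')
clf = classifier.Classifier()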
3. I prefer not adding the command line feature if everything can be done from the Python call.
Is this for 4. and --pdb?
yes
Another feature that would be nice to have: an option to separate what is saved from what is printed to the console. This would allow saving extensive metrics without flooding the terminal.
Partial fit for models where e.g. the number of trees or the number of epochs is a hyperparameter. This would mainly be a feature used by hyperopt (killing trainings early), but maybe also useful as a CLI param.
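As an illustration of the kind of loop this would enable, a plain scikit-learn sketch (warm_start stands in for a real partial-fit hook; the "best score so far" would come from whatever hyperopt shares between runs):

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

best_so_far = 0.80  # e.g. read from a shared file or global variable during hyperopt
clf = GradientBoostingClassifier(n_estimators=50, warm_start=True)
for n_estimators in range(50, 501, 50):
    # grow the ensemble in chunks instead of refitting from scratch
    clf.set_params(n_estimators=n_estimators)
    clf.fit(X_tr, y_tr)
    score = clf.score(X_val, y_val)
    if score < best_so_far - 0.05:  # crude early-killing rule
        break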
Standardized LaTeX tables computed from saved scores. Probably two steps: first collect all scores (of selected submissions and data labels) into a well-designed pandas table; then a set of tools to create LaTeX tables, scores with CI, and also paired tests. I especially like the plots and score presentation in https://link.springer.com/article/10.1007/s10994-018-5724-2.
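A sketch of the first step, assuming the per-fold scores saved by --save-output sit in training_output/fold_<i>/scores.csv under each submission (the exact file layout should be checked):

from pathlib import Path

import pandas as pd

# collect every fold's scores into one long table, then let pandas emit the LaTeX
frames = []
for path in Path('submissions').glob('*/training_output/fold_*/scores.csv'):
    df = pd.read_csv(path, index_col=0)
    df['submission'] = path.parts[1]
    df['fold'] = path.parent.name
    frames.append(df)

scores = pd.concat(frames)
summary = (scores.drop(columns='fold')
                 .groupby('submission')
                 .agg(['mean', 'std']))
print(summary.to_latex(float_format='%.3f'))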
When RAMP is used for developing models for a problem, we may want to tag certain versions of a submission, and even problem.py, together with the scores. One idea is to use git tags. For example, after running ramp-test ... --save-output, one could run another script that git-adds problem.py, the submission files, and the scores in training_output/fold_<i>, then commits and tags with a user-defined tag (plus maybe a prefix indicating that it is a scoring tag, so later we may automatically search for all such tags).
It would be great to have a look at MLflow; @agramfort pointed it out to me. There are some parts that we could use, for instance the tracking one.
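A sketch of what the tracking part could look like wrapped around a run (standard MLflow API; the data_label parameter and the training_output path are assumptions made for the example):

import mlflow

from rampwf.utils import assert_submission

# one MLflow run per ramp-test call: log the experiment conditions as params,
# run the submission, then keep the saved output as artifacts
with mlflow.start_run(run_name='starting_kit'):
    mlflow.log_param('submission', 'starting_kit')
    mlflow.log_param('data_label', 'full')  # placeholder experiment condition
    assert_submission(submission='starting_kit')
    mlflow.log_artifacts('submissions/starting_kit/training_output')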
- When loading the data in RAMP, it seems the training data is read twice. When the data is big, this is a bit slow.
- Is it possible to parallelize the CV process?
- Yes, the training data is read twice for the moment, since X_train, y_train, X_test, y_test = assert_data(ramp_kit_dir, ramp_data_dir, data_label) is called twice in the testing.py module. The same issue appears with the 'problem' variable, which is loaded 5 times. It is possible to fix these issues by making the testing module object oriented: attributes corresponding to each of these variables (X_train, X_test, ...) could be created and we would not need to repeat calls to some functions (see the sketch below). But do we agree to add more object-oriented code?
- Yes, it is.
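A sketch of the caching idea, without committing to a fully object-oriented testing module (cached properties on a small helper class; the class and method names are illustrative, not the actual testing.py code):

import importlib.util
from functools import cached_property


class RampRun:
    """Load problem.py and the data once, then reuse them across folds."""

    def __init__(self, ramp_kit_dir='.'):
        self.ramp_kit_dir = ramp_kit_dir

    @cached_property
    def problem(self):
        # problem.py is imported a single time instead of five times
        spec = importlib.util.spec_from_file_location(
            'problem', f'{self.ramp_kit_dir}/problem.py')
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        return module

    @cached_property
    def train_data(self):
        # the (possibly large) training data is read from disk only once
        return self.problem.get_train_data(path=self.ramp_kit_dir)

    @cached_property
    def test_data(self):
        return self.problem.get_test_data(path=self.ramp_kit_dir)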