
Collecting feature requests around model development with RAMP

Open kegl opened this issue 3 years ago • 24 comments

When RAMP is used for developing models for a problem, we may want to tag certain versions of a submission, and even of problem.py, together with the scores. One idea is to use git tags. For example, after running ramp-test ... --save-output, one could run another script that git-adds problem.py, the submission files, and the scores in training_output/fold_<i>, then commits and tags with a user-defined tag (plus maybe a prefix indicating that it is a scoring tag, so that later we may automatically search for all such tags).
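
Something like the following could do the tagging after a ramp-test ... --save-output run. This is only a rough sketch: the script, the score/ tag prefix and the exact paths are made up, not an existing tool.

# hypothetical post-scoring tagging script (illustrative paths and tag prefix)
import subprocess
import sys

def tag_scored_submission(submission, tag):
    paths = [
        'problem.py',
        f'submissions/{submission}',  # submission files + training_output/fold_<i>
    ]
    subprocess.run(['git', 'add'] + paths, check=True)
    subprocess.run(
        ['git', 'commit', '-m', f'score: {submission} @ {tag}'], check=True)
    # 'score/' prefix so scoring tags can later be listed with: git tag -l "score/*"
    subprocess.run(['git', 'tag', f'score/{tag}'], check=True)

if __name__ == '__main__':
    tag_scored_submission(sys.argv[1], sys.argv[2])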

kegl avatar Oct 16 '20 17:10 kegl

  1. When loading the data in RAMP, it seems the training data is read twice. When the data is big, this is a bit slow.
  2. Is it possible to parallelize the CV process?

zhangJianfeng avatar Nov 09 '20 10:11 zhangJianfeng

Adding a feature that would be useful, at least to me: it would be great to have the ability to import more code from elsewhere in a submission, allowing multiple submissions to share some code. For now it can be done by creating a library and importing it, which is a bit tedious. @albertcthomas mentioned this could perhaps be done in a similar way to what pytest does: they have a conftest.py file for code that you want to reuse across different test modules. #181

gabriel-hurtado avatar Nov 09 '20 16:11 gabriel-hurtado

@albertcthomas mentioned this could perhaps be done in a similar way to what pytest does: they have a conftest.py file for code that you want to reuse across different test modules.

Well, it is more like "this makes me think of conftest.py, which can be used to share fixtures", but I don't know what happens when you run pytest and I am not sure the comparison goes very far :). As written in the pytest doc: "The next example puts the fixture function into a separate conftest.py file so that tests from multiple test modules in the directory can access the fixture function". This feature is discussed in issue #181.
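
For readers not familiar with it, here is a minimal (non-RAMP) illustration of the conftest.py mechanism; the file and fixture names are made up.

# conftest.py
import pytest

@pytest.fixture
def shared_data():
    # any test module in this directory can request this fixture by name
    return [1, 2, 3]

# test_example.py -- uses the fixture without importing conftest.py
def test_sum(shared_data):
    assert sum(shared_data) == 6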

albertcthomas avatar Nov 09 '20 18:11 albertcthomas

  1. I find the step of reading the data takes too much time: it is slower than reading it without RAMP.
  2. It would be great if the mean result were also saved along with the bagged one.
  3. Propose a LaTeX syntax for the results.
  4. When the output is saved, it would be better to also save the experiment conditions (data label, tested hyperparameters, etc.) and keep it all somewhere, either locally or in the cloud, to check later.

illyyne avatar Nov 12 '20 10:11 illyyne

Here are some features that could help:

  • Model selection: "early killing" (e.g. successive halving or even simpler schemes), which implies sharing information during hyperopt, or at least having a way to compare the current model to the best one so far (either a global python variable or saving it somehow on disk...)
  • Experimental protocol: having a parametrized problem.py. I'm keen on JSON (which could also be saved each time you launch ramp-test); see the sketch after this list. I'm not a big fan of using commit tags since I may launch 10 different batches of experiments on different servers without wanting to commit each time just for an experiment's configuration file.
  • Logging:
    • Model saving and loading (path, hyperopt, ...)
    • Possibility to rename the output score folders. E.g. depending on the task and the models I've implemented, I'd rather save the results with a different directory hierarchy, say w.r.t. hyperparameters or more global options. It helps with regex search (useful with tensorboard for example) and with plotting when dealing with tens of thousands of run experiments (and looking at parameter sensitivity).
  • Other:
    • Being able to modify the submissions while some experiments are running (it looks like the .py submission file is loaded several times; I'm in the habit of loading the class somewhere, which allows me to do whatever I want while my experiments are running)
    • Same as Gabriel: ease the imports in a submission. Maybe I didn't find the right way to do so, but there is a lot of duplicated code in my submissions even though I've implemented a Pytorch2RAMP class. #181
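
To make the "parametrized problem.py" point concrete, here is a minimal sketch of what reading a JSON experiment configuration in problem.py could look like; the config file name and keys are invented.

# hypothetical excerpt of a JSON-parametrized problem.py
import json
from pathlib import Path

from sklearn.model_selection import ShuffleSplit

_config_path = Path('ramp_config.json')
_config = json.loads(_config_path.read_text()) if _config_path.exists() else {}

n_cv = _config.get('n_cv', 8)
test_size = _config.get('test_size', 0.2)

def get_cv(X, y):
    # the CV scheme is now driven by the saved experiment configuration
    cv = ShuffleSplit(n_splits=n_cv, test_size=test_size, random_state=57)
    return cv.split(X, y)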

LudoHackathon avatar Nov 19 '20 10:11 LudoHackathon

From my (little) experience with RAMP, what made people a bit reluctant to use it was that it was too high level. Meaning that we don't see the classical sequential process we are used to seeing in an ML script (load data, instantiate model, train it, test it). As an example, Keras (not the same purpose as RAMP) embedded some parts of the script to minimize the main script but kept the overall spirit of the classical script, making it as understandable as the original one. Using ramp-test on the command line may make RAMP more obscure to new users. Maybe having a small script (like the one already in the documentation, for example) giving the user a more pythonic way to play with it, without having to use ramp-test as a command line, could make machine learners more willing to use it.

LudoHackathon avatar Nov 23 '20 10:11 LudoHackathon

I have heard this many times too. Debugging is a pain etc. To fix this, I now stick to RAMP kits where you need to return a sklearn estimator that implements fit and predict, so you can replace ramp-test by sklearn's cross_val_score and just use your favorite env to inspect / debug / run (vscode, notebook, google colab etc.)
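
For instance, something along these lines; this is a sketch that assumes a kit whose submission exposes a get_estimator() function returning a scikit-learn estimator, and the file names are illustrative.

import importlib.util

from sklearn.model_selection import cross_val_score

import problem  # the kit's problem.py, when running from the kit root

# load the submission file by path (its exact name depends on the kit)
spec = importlib.util.spec_from_file_location(
    'estimator', 'submissions/starting_kit/estimator.py')
estimator_module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(estimator_module)

X_train, y_train = problem.get_train_data()
print(cross_val_score(estimator_module.get_estimator(), X_train, y_train, cv=5))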

agramfort avatar Nov 23 '20 10:11 agramfort

Calling ramp-test from a notebook is as simple as

from rampwf.utils import assert_submission
assert_submission(submission='starting_kit')

This page https://paris-saclay-cds.github.io/ramp-docs/ramp-workflow/advanced/scoring.html now contains two code snippets that you can use to call lower-level elements of the workflow and emulate a simple train/test and cross-validation loop. @LudoHackathon do you have a suggestion what else would be useful? E.g. an example notebook in the library?

kegl avatar Nov 23 '20 15:11 kegl

the doc says:

trained_workflow = problem.workflow.train_submission(
    'submissions/starting_kit', X_train, y_train)

after all these years I did not know this :'(

this should be explained in the kits to spare students some pain

agramfort avatar Nov 23 '20 17:11 agramfort

this should be explained in the kits to spare students some pain

wasn't this the purpose of the "Working in the notebook" section of the old titanic notebook starting kit?

albertcthomas avatar Nov 23 '20 18:11 albertcthomas

Yes, @albertcthomas is right, but the snippet in the doc is cleaner now. I'm doing this decomposition in every kit now, see for example line 36 here https://github.com/ramp-kits/optical_network_modelling/blob/master/optical_network_modelling_starting_kit.ipynb. This snippet is even simpler than the one in the doc but less general: it only works when the Predictions class does nothing with the input numpy array, which is the case most of the time (regression and classification). Feel free to reuse it.
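
For reference, the decomposition boils down to something like the following sketch, based on the scoring doc above; it assumes assert_read_problem is importable from rampwf.utils and that the score type can be called directly on the raw arrays (the simplification mentioned above).

from rampwf.utils import assert_read_problem

problem = assert_read_problem()                  # reads problem.py in the cwd
X_train, y_train = problem.get_train_data()
X_test, y_test = problem.get_test_data()

trained_workflow = problem.workflow.train_submission(
    'submissions/starting_kit', X_train, y_train)
y_pred = problem.workflow.test_submission(trained_workflow, X_test)
# assumes the score type accepts raw arrays, i.e. the Predictions class
# does nothing with the input numpy array
print(problem.score_types[0](y_test, y_pred))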

kegl avatar Nov 23 '20 18:11 kegl

This page https://paris-saclay-cds.github.io/ramp-docs/ramp-workflow/advanced/scoring.html now contains two code snippets that you can use to call lower-level elements of the workflow and emulate a simple train/test and cross-validation loop. @LudoHackathon do you have a suggestion what else would be useful? E.g. an example notebook in the library?

The page does a good job of showing how you can call the different elements (and thus play with them, do plots, ...).

  1. for better visibility we might clearly say that there is a command-line interface based on ramp-test and a way of calling the needed functions easily in a python script (or notebook). Of course we could add an example showing the python script interface.

  2. More importantly, maybe think of what can break when you go from one interface to the other. For instance, imports from other modules located in the current working directory. This still forces us/the students to work with submission files. I think that using the "scikit-learn kits" eases the transfer of your scikit-learn estimator from your exploratory python script/notebook to a submission file and makes sure that this works in most cases. I let @agramfort confirm this :)

  3. Instead of

from rampwf.utils import assert_submission
assert_submission(submission='starting_kit')

we could have something like

from rampwf import ramp_test
ramp_test(submission='starting_kit')

Debugging is a pain etc.

For debugging with the command line, I have to say that I rely a lot on adding a breakpoint where I want to enter the debugger. However, this cannot be done post-mortem, unlike %debug in ipython or jupyter. For this we could have a --pdb or --trace flag as in pytest. But it's true that it's easier to try things and play with your models/pipelines when not using the command line.

albertcthomas avatar Nov 23 '20 18:11 albertcthomas

use your favorite env to inspect / debug / run (vscode, notebook, google colab etc.)

giving the user a more pythonic way to play with it, without having to use ramp-test as a command line

this is an important point. 2 or 3 years ago I was rarely using the command-line and I always preferred staying in a python environment. Users should be able to use their favorite tool to play with their models and we should make sure that at the end it will work when calling ramp-test in the command line.

albertcthomas avatar Nov 23 '20 18:11 albertcthomas

  1. OK
  2. no comment
  3. OK. In fact we may put the focus on the python call and tell them to use the ramp-test command line as a final unit test, the same way one would use pytest. I think the cleanest way would be to have ramp_test defined in https://github.com/paris-saclay-cds/ramp-workflow/blob/advanced/rampwf/utils/cli/testing.py and have main just call ramp_test with the exact same signature (see the sketch after this list). In this way it's certain that the two calls do the same thing.
  4. I prefer not adding the command line feature if everything can be done from the python call.
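
Roughly what point 3 could look like in testing.py; this is a sketch, the option names and signature are illustrative, not the current code.

import click

def ramp_test(submission='starting_kit', ramp_kit_dir='.',
              data_label=None, save_output=False):
    # the body currently living in the CLI command would move here
    ...

@click.command()
@click.option('--submission', default='starting_kit')
@click.option('--ramp-kit-dir', default='.')
@click.option('--data-label', default=None)
@click.option('--save-output', is_flag=True)
def main(submission, ramp_kit_dir, data_label, save_output):
    # the command line becomes a thin wrapper with the exact same signature
    ramp_test(submission=submission, ramp_kit_dir=ramp_kit_dir,
              data_label=data_label, save_output=save_output)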

kegl avatar Nov 23 '20 19:11 kegl

3. I prefer not adding the command line feature if everything can be done from the python call.

is this for 4. and --pdb?

albertcthomas avatar Nov 23 '20 19:11 albertcthomas

doing:

import imp
feature_extractor = imp.load_source(
    '', 'submissions/starting_kit/feature_extractor.py')
fe = feature_extractor.FeatureExtractor()
classifier = imp.load_source(
    '', 'submissions/starting_kit/classifier.py')
clf = classifier.Classifier()

is too complex to me and should be avoided. We have a way suggested by @kegl based on the rampwf functions.

now I agree with @albertcthomas, leaving the notebook to edit python files is a bit error-prone.

what I have shown to students is to use the %%file magic to write a cell to a file on disk.
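
For example, something like the cell below (the cell magic has to be the first line of the cell; the file path and the class are just an illustration):

%%file submissions/starting_kit/classifier.py
from sklearn.linear_model import LogisticRegression

class Classifier:
    def fit(self, X, y):
        self.clf_ = LogisticRegression()
        self.clf_.fit(X, y)
        return self

    def predict_proba(self, X):
        return self.clf_.predict_proba(X)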

anyway I think we should show in each notebook what is the easy way.

The ramp-test command is an easy way for us to know that it works on their systems, but it's not the most agile way for them when they need to come up with their own solution.

agramfort avatar Nov 23 '20 20:11 agramfort

import imp
feature_extractor = imp.load_source(
    '', 'submissions/starting_kit/feature_extractor.py')
fe = feature_extractor.FeatureExtractor()
classifier = imp.load_source(
    '', 'submissions/starting_kit/classifier.py')
clf = classifier.Classifier()

is too complex to me and should be avoided. We have a way suggested by @kegl based on the rampwf functions.

I'm not sure what you mean here. We're using import_module_from_source now.

kegl avatar Nov 24 '20 12:11 kegl

I copied these lines from the titanic starting kit, which is used to get students started on RAMP.

agramfort avatar Nov 24 '20 12:11 agramfort

3. I prefer not adding the command line feature if everything can be done from the python call.

is this for 4. and --pdb?

yes

kegl avatar Nov 24 '20 17:11 kegl

Another feature that would be nice to have: an option to separate what is saved from what is printed to the console. This would allow saving extensive metrics without flooding the terminal.

gabriel-hurtado avatar Dec 16 '20 17:12 gabriel-hurtado

Partial fit for models where, e.g., the number of trees or the number of epochs is a hyperparameter. This would mainly be a feature used by hyperopt (killing trainings early) but maybe also useful as a CLI param.
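
Not RAMP code, but to illustrate the kind of partial training meant here: scikit-learn's warm_start already allows growing an ensemble a few trees at a time, so a hyperopt loop could kill it early.

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(random_state=0)
clf = GradientBoostingClassifier(warm_start=True)
for n_estimators in (10, 20, 40, 80):
    clf.set_params(n_estimators=n_estimators)
    clf.fit(X, y)            # keeps the previously fitted trees and adds new ones
    score = clf.score(X, y)
    if score > 0.99:         # an early-killing rule would plug in here
        break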

kegl avatar Jan 28 '21 16:01 kegl

Standardized LaTeX tables computed from saved scores. Probably two steps: first collect all scores (of selected submissions and data labels) into a well-designed pandas table, then a set of tools to create LaTeX tables, scores with CIs, and also paired tests. I especially like the plots and score presentation in https://link.springer.com/article/10.1007/s10994-018-5724-2.
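
A first rough cut of the two steps; the folder layout and file names below are guesses, not the actual --save-output format.

import glob
import pandas as pd

# step 1: collect fold scores of all submissions into one pandas table
rows = []
for path in glob.glob('submissions/*/training_output/fold_*/scores.csv'):
    fold_scores = pd.read_csv(path)
    fold_scores['submission'] = path.split('/')[1]
    rows.append(fold_scores)
scores = pd.concat(rows, ignore_index=True)

# step 2: mean and std over folds, per submission, exported to LaTeX
grouped = scores.groupby('submission')
summary = pd.concat(
    {'mean': grouped.mean(numeric_only=True),
     'std': grouped.std(numeric_only=True)}, axis=1)
print(summary.to_latex(float_format='%.3f'))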

kegl avatar Feb 04 '21 14:02 kegl

When RAMP is used for developing models for a problem, we may want to tag certain versions of a submission, and even of problem.py, together with the scores. One idea is to use git tags. For example, after running ramp-test ... --save-output, one could run another script that git-adds problem.py, the submission files, and the scores in training_output/fold_<i>, then commits and tags with a user-defined tag (plus maybe a prefix indicating that it is a scoring tag, so that later we may automatically search for all such tags).

It would be great to have a look at MLflow; @agramfort pointed it out to me. There are some parts that we could use, for instance the tracking one.
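
For what it's worth, the tracking part would look roughly like this; the values and names are purely illustrative, nothing is wired into RAMP.

import mlflow

# one run per ramp-test call; parameter/metric names are just an example
with mlflow.start_run(run_name='starting_kit'):
    mlflow.log_param('submission', 'starting_kit')
    mlflow.log_param('data_label', 'default')
    for fold, rmse in enumerate([0.31, 0.29, 0.33]):
        mlflow.log_metric('rmse', rmse, step=fold)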

albertcthomas avatar Feb 26 '21 16:02 albertcthomas

  1. When loading the data in RAMP, it seems the training data is read twice. When the data is big, this is a bit slow.
  2. Is it possible to parallelize the CV process?

  1. Yes, training data is read twice for the moment, since X_train, y_train, X_test, y_test = assert_data(ramp_kit_dir, ramp_data_dir, data_label) is called twice in the testing.py module. The same issue appears with the 'problem' variable, which is loaded 5 times.
    It would be possible to fix this by making the testing module object oriented: attributes corresponding to each of these variables (X_train, X_test, ...) could be created, and we would not need to repeat the calls to some functions (see the sketch below). But do we agree to add more object-oriented code?

  2. Yes, it is.
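
One possible shape for that refactoring, purely as a sketch: the class name is made up, and assert_read_problem / assert_data are used as in the quote above.

class SubmissionTester:
    def __init__(self, ramp_kit_dir='.', ramp_data_dir='.', data_label=None):
        from rampwf.utils import assert_data, assert_read_problem
        # read problem.py and the data once, then reuse them
        self.problem = assert_read_problem(ramp_kit_dir)
        (self.X_train, self.y_train, self.X_test, self.y_test) = assert_data(
            ramp_kit_dir, ramp_data_dir, data_label)

    def test(self, submission='starting_kit'):
        # train/test the submission on each fold, reusing the cached
        # problem and data instead of reloading them
        ...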

martin1tab avatar Mar 11 '21 18:03 martin1tab