delira
remove timestamp from save path
To resolve issue #197
I think that is a problem with the current design of the experiment.resume function, which uses a separate save_path as its argument. Maybe it would be more suitable to design the resume function as a static method which also restores the original experiment settings (i.e. the arguments passed to __init__). Obviously, it would then be necessary to save those settings inside the experiment.
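A rough sketch of that static-method idea could look like the following. Everything here (the class name, the `experiment_config.pkl` filename, and the pickle format) is an illustrative assumption, not delira's actual implementation:

```python
import os
import pickle


class Experiment:
    """Sketch only, NOT delira's real API: the __init__ arguments are
    persisted inside the experiment directory so that ``resume`` can
    restore the original experiment settings later."""

    def __init__(self, save_path, **settings):
        self.save_path = save_path
        self.settings = settings
        os.makedirs(save_path, exist_ok=True)
        # dump the init arguments right away, not at the end of training
        with open(os.path.join(save_path, "experiment_config.pkl"), "wb") as f:
            pickle.dump(settings, f)

    @staticmethod
    def resume(save_path):
        # restore the experiment from its own saved settings instead of
        # requiring the caller to pass them in again
        with open(os.path.join(save_path, "experiment_config.pkl"), "rb") as f:
            settings = pickle.load(f)
        return Experiment(save_path, **settings)
```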
I thought the increasing integer value would be a nice way to decrease the length of the experiment names 🙈
I would prefer a version that ensures unique names if a path already exists. I usually have one naming convention and rely on the timestamp part to avoid overwriting previous checkpoints. My proposal: we add a flag to the experiment (called automatic_resume or something like this), which defaults to True and indicates whether to resume or to create a unique path.
This flag is what I had in mind too, but we will have to adjust some things for this, as pickle dumps all information about the experiment setup at the end of training, right? It might be better to dump the configuration first and then train the network, so that we just have to search for the latest checkpoint.
Searching for the latest checkpoint is already handled by the trainer
Format to use for the stamp:
sequence number_year_month_day
If the run should be continued and the path is already present, the first sequence number is reused.
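A minimal sketch of this stamp scheme, assuming a helper like the hypothetical `make_run_path` below (the two-digit padding and directory layout are assumptions for illustration):

```python
import os
from datetime import datetime


def make_run_path(base_dir, experiment_name, resume=False):
    """Return a run directory named <seq>_<year>_<month>_<day>.

    If ``resume`` is True and runs already exist, the first (lowest)
    sequence number is reused instead of creating a new one.
    """
    root = os.path.join(base_dir, experiment_name)
    os.makedirs(root, exist_ok=True)
    existing = sorted(d for d in os.listdir(root)
                      if os.path.isdir(os.path.join(root, d)))
    if resume and existing:
        # continue the run with the first sequence number
        return os.path.join(root, existing[0])
    stamp = datetime.now().strftime("%Y_%m_%d")
    return os.path.join(root, "%02d_%s" % (len(existing), stamp))
```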
This is a minor issue, but can we maybe put the sequence number at the end? This seems more intuitive to me
We briefly agreed on having it this way in the meeting today; of course, the other way round makes sense too. I kind of like the idea of having it this way, so that a simple numeric sorting is recognised at first sight, but either way is fine for me.
Okay, that's a valid reason too, but this way you get the first runs from different days sorted together instead of a "real" development over time. What do the others think about this? @haarburger @mibaumgartner ?
Oh, I would continue numbering even if the day changes.
That's something I wouldn't like. If we create an experiment, the date of the creation should be used for all runs, but if we re-initiate an experiment, I would also choose a new date
Ok, to clarify: if I start an experiment today, it will be 00_date_today, and 01_date_today if I start the same one an hour later.
If I start it again tomorrow or in a month, it would be 02_date_then, but only if the experiment name is still the same, which I think should be changed anyway if something else is tried at a later point.
Ah okay, I would do it like
date_today_1, date_today_2 etc. But then I would restart with date_then_1.
Another problem is that we need some management of adding zeroes. E.g. if we already have 9 successful runs (named in one of the ways above; it doesn't matter which one exactly) and we start the 10th run, we need to move the previous runs to paths with leading zeroes to keep the sorting correct (and the same once we reach the 100th, 1000th etc. run). This shouldn't be hard to realize, but it is something we need to be aware of.
EDIT: the naming convention you proposed is also okay for me, as long as we take care of the necessary zeroes
After looking at the pros and cons, I would prefer the number at the end (I did not think about sorting during our meeting). Furthermore, I would like to start from 0 each day (edit: and) zero-pad to three digits: 001 ... 010 ... 100 ... (no padding) ... 1000. I think most users won't run more than 1000 experiments per day (the only drawback would be a suboptimal ordering of the experiments when running more than 1000).
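As a side note, this padding rule corresponds exactly to Python's built-in `str.zfill`, which pads up to a fixed width and leaves longer numbers untouched (a tiny sketch, not the actual delira code):

```python
def format_sequence(seq):
    # zero-pad to three digits; numbers >= 1000 are left unpadded,
    # matching the "001 ... 010 ... 100 ... 1000" convention above
    return str(seq).zfill(3)
```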
I'd prefer the implementation that @mibaumgartner just suggested.
@mibaumgartner after I fixed the style check, the sklearn backend now crashes. Could you check whether that is a known bug? It seems to me like it could be.
The failure of the test is caused by the fact that in Python 3.5 the ordering of a dict is not preserved. Due to this, we have to rewrite the test so that it only checks whether both lists contain the same elements, but does not check their order.
EDIT: I addressed this in #227
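An order-insensitive check could be sketched with `collections.Counter` (assuming the list elements are hashable; the helper name is made up, not the actual test code):

```python
from collections import Counter


def assert_same_elements(actual, expected):
    """Pass iff both lists contain exactly the same elements
    (including duplicates), regardless of their order."""
    assert Counter(actual) == Counter(expected)
```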
Ok, I would actually suggest they should still be somewhat unique, so that a k-fold continuation is possible, in which we just resume from the highest number. Meaning we omit the timestamp in case we want to continue training, or what do you have in mind?
Yeah, something like that. But we don't need to take care of the k-fold in this case, because inside the save_path the experiment will also create a separate folder for each run.
As this has been open for a while now, maybe we should just add training continuation, shouldn't we? But someone mentioned it is already implemented, so it would be great to share the idea if there is already a fixed plan for this.