remove timestamp from save path

Open gedoensmax opened this issue 6 years ago • 19 comments

To resolve issue #197

gedoensmax avatar Sep 06 '19 13:09 gedoensmax

I think that is a problem with the current design of the experiment.resume function, which takes a separate save_path as its argument. Maybe it would be more suitable to design resume as a static method that also restores the original experiment settings (i.e. the arguments passed to __init__). Obviously, those settings would then have to be saved inside the experiment.

I thought the increasing integer value would be a nice way to decrease the length of the experiment names 🙈

mibaumgartner avatar Sep 06 '19 16:09 mibaumgartner

I would prefer a version that ensures unique names if a path already exists. I usually have one naming convention and rely on the timestamp part to avoid overwriting previous checkpoints. My proposal: we add a flag to the experiment (called automatic_resume or something like this) that defaults to True and indicates whether to resume or to create a unique path.
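
A minimal sketch of how such a flag could behave, for illustration only. The helper name `resolve_save_path` and the `_1`, `_2`, ... suffix scheme are assumptions, not part of delira's API; only the `automatic_resume` name comes from the proposal above.

```python
import os
import tempfile


def resolve_save_path(save_path, automatic_resume=True):
    """Return the directory an experiment should write to.

    Hypothetical helper: with automatic_resume=True an existing path is
    reused (resume); otherwise a unique sibling path is chosen.
    """
    if automatic_resume or not os.path.exists(save_path):
        # Either we resume in place, or the path is still free.
        return save_path

    # The path is taken and resuming is disabled: find the first free suffix.
    counter = 1
    while os.path.exists("{}_{}".format(save_path, counter)):
        counter += 1
    return "{}_{}".format(save_path, counter)


# Small demonstration inside a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, "my_experiment")
    os.makedirs(path)
    print(resolve_save_path(path, automatic_resume=True))
    print(resolve_save_path(path, automatic_resume=False))
```

With the default, an existing directory is simply reused, so previous checkpoints are only protected from overwriting when the flag is switched off.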

justusschock avatar Sep 06 '19 16:09 justusschock

This flag is what I had in mind too, but we will have to adjust some things for it, since pickle dumps all information about the experiment setup at the end of training, right? It might be better to dump the configuration first and then train the network, so that we just have to search for the latest checkpoint.
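
Roughly what "dump the configuration first" could look like, combined with the idea from earlier in the thread of a resume method that restores the original settings. This is a toy sketch: the `Experiment` class here is a stand-in, and `dump_config`, `CONFIG_FILE` and the file name are hypothetical, not delira's actual implementation. A classmethod is used in place of a static method so that subclasses are restored as their own type.

```python
import os
import pickle
import tempfile


class Experiment:
    """Toy stand-in for an experiment class (not delira's Experiment)."""

    CONFIG_FILE = "experiment_config.pkl"  # hypothetical file name

    def __init__(self, name, save_path, **params):
        self.name = name
        self.save_path = save_path
        self.params = params

    def dump_config(self):
        # Write the settings before training starts, so that an
        # interrupted run can still be resumed from its save_path.
        os.makedirs(self.save_path, exist_ok=True)
        with open(os.path.join(self.save_path, self.CONFIG_FILE), "wb") as f:
            pickle.dump({"name": self.name, "params": self.params}, f)

    @classmethod
    def resume(cls, save_path):
        # Rebuild the experiment from the settings stored in save_path.
        with open(os.path.join(save_path, cls.CONFIG_FILE), "rb") as f:
            config = pickle.load(f)
        return cls(config["name"], save_path, **config["params"])


# Small demonstration inside a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    exp = Experiment("demo", os.path.join(root, "demo"), lr=0.01)
    exp.dump_config()  # would be called before training
    restored = Experiment.resume(exp.save_path)
    print(restored.name, restored.params)
```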

gedoensmax avatar Sep 06 '19 19:09 gedoensmax

Searching for the latest checkpoint is already handled by the trainer

justusschock avatar Sep 07 '19 06:09 justusschock

Format to use for the stamp:

sequence number_year_month_day

If the run should be continued, the first sequence is used if present.

gedoensmax avatar Sep 24 '19 15:09 gedoensmax

This is a minor issue, but can we maybe put the sequence number at the end? This seems more intuitive to me

justusschock avatar Sep 24 '19 16:09 justusschock

We briefly agreed on having it this way in the meeting today, but of course the other way round makes sense too. I kinda like having it this way because a simple numeric sort is recognisable at first sight, but either way is fine for me.

gedoensmax avatar Sep 24 '19 16:09 gedoensmax

Okay, that's a valid reason too, but this way you get the first runs from different days sorted together instead of a "real" development over time. What do the others think about this? @haarburger @mibaumgartner ?

justusschock avatar Sep 24 '19 16:09 justusschock

Oh, I would continue numbering even if the day changes.

gedoensmax avatar Sep 24 '19 16:09 gedoensmax

That's something I wouldn't like. If we create an experiment, the date of the creation should be used for all runs, but if we re-initiate an experiment, I would also choose a new date

justusschock avatar Sep 24 '19 16:09 justusschock

> That's something I wouldn't like. If we create an experiment, the date of the creation should be used for all runs, but if we re-initiate an experiment, I would also choose a new date

OK, to clarify: if I start an experiment today, it will be 00_date_today, and 01_date_today if I start the same one an hour later. If I start it again tomorrow or in a month, it would be 02_date_then, but only if the experiment name is still the same, which I think should be changed if something else is tried at a later point.

gedoensmax avatar Sep 24 '19 16:09 gedoensmax

Ah okay, I would do it like

date_today_1, date_today_2 etc. But then I would restart with date_then_1.

Another problem is that we need some management of leading zeroes. E.g. if we already have 9 successful runs (named in one of the ways above, it doesn't matter which one exactly) and we start the 10th run, we need to move the previous runs to paths with leading zeroes to keep the sorting correct (and the same applies when we reach the 100th, 1000th etc. run). This shouldn't be hard to realize, but it is something we need to be aware of.
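
To illustrate the renaming step, here is a small sketch that computes which existing runs would have to be moved. It assumes the variant with the number at the end of the name; `repad_names` is a hypothetical helper, not something that exists in delira.

```python
import re


def repad_names(names, width):
    """Map run names to versions whose trailing number has `width` digits.

    Only names that actually change are returned. Names that do not end
    in "_<number>" are ignored.
    """
    renames = {}
    for name in names:
        match = re.match(r"^(.*_)(\d+)$", name)
        if match is None:
            continue  # does not follow the naming convention
        prefix, number = match.groups()
        padded = prefix + number.zfill(width)
        if padded != name:
            renames[name] = padded
    return renames


# When the 10th run is created, runs 1..9 need one leading zero.
existing = ["2019_09_24_1", "2019_09_24_9", "2019_09_24_10"]
for old, new in sorted(repad_names(existing, width=2).items()):
    print(old, "->", new)
```

The returned mapping could then be applied with `os.rename` on the run directories. Padding to a fixed width from the start avoids the renaming altogether.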

EDIT: the naming convention you proposed is also okay for me, as long as we take care of the necessary zeroes

justusschock avatar Sep 24 '19 16:09 justusschock

After looking at the pros and cons, I would prefer the number at the end (I did not think about sorting during our meeting). Furthermore, I would like to start from 0 each day and zero-pad to three digits: 001 ... 010 ... 100 ..., with no padding from 1000 onwards. I think most users won't run more than 1000 experiments per day (the only drawback would be a suboptimal ordering of the experiments when running more than 1000).
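
A possible sketch of this scheme (date first, zero-padded number at the end, counting from 0 again each day). The function name `next_run_name` and the exact `year_month_day` stamp format are assumptions for illustration.

```python
import datetime
import re


def next_run_name(existing, today=None, width=3):
    """Return the next free run name of the form <year>_<month>_<day>_<index>.

    `existing` is an iterable of names already present in the save path.
    The index restarts at 0 for every new day and is zero-padded to
    `width` digits; larger indices are simply not padded.
    """
    today = today or datetime.date.today()
    stamp = today.strftime("%Y_%m_%d")
    pattern = re.compile(r"^" + re.escape(stamp) + r"_(\d+)$")

    # Collect the indices already used today.
    used = [int(m.group(1)) for m in map(pattern.match, existing) if m]
    index = max(used) + 1 if used else 0
    return "{}_{}".format(stamp, str(index).zfill(width))


day = datetime.date(2019, 9, 24)
print(next_run_name(["2019_09_23_004", "2019_09_24_000"], today=day))
```

Because the date comes first and the index is padded, plain lexicographic sorting matches chronological order for up to 1000 runs per day.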

mibaumgartner avatar Sep 24 '19 17:09 mibaumgartner

I'd prefer the implementation that @mibaumgartner just suggested.

haarburger avatar Sep 24 '19 17:09 haarburger

@mibaumgartner after I fixed the style check, the sklearn backend now crashes. Could you check whether that is a known bug? It looks to me like it could be.

gedoensmax avatar Oct 09 '19 06:10 gedoensmax

The test fails because dict ordering is not preserved in Python 3.5. We therefore have to rewrite the test so that it only checks that both lists contain the same elements, without checking their order.

EDIT: I addressed this in #227
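
One way such an order-independent check could look, as a sketch. `same_elements` is a hypothetical helper and not necessarily what the actual fix uses; it assumes the list elements are hashable.

```python
from collections import Counter


def same_elements(first, second):
    """True if both iterables hold the same elements with the same counts,
    regardless of order. Elements must be hashable."""
    return Counter(first) == Counter(second)


# Keys collected from a dict may come back in any order on Python 3.5.
expected = ["acc", "loss", "val_loss"]
observed = list({"val_loss": 0.3, "acc": 0.9, "loss": 0.2}.keys())
print(same_elements(expected, observed))
```

Inside a `unittest.TestCase`, `assertCountEqual` performs the same kind of comparison.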

justusschock avatar Oct 17 '19 11:10 justusschock

OK, I would actually suggest they should still be somewhat unique, so that a k-fold continuation is possible in which we just start the highest number again. Meaning we omit the timestamp in case we want to continue training, or what do you have in mind?

gedoensmax avatar Oct 25 '19 15:10 gedoensmax

Yeah something like that. But we don't need to take care of the kfold in this case, because inside the save_path the experiment will also create a separate folder for each run.

justusschock avatar Oct 25 '19 15:10 justusschock

As this has been open for a while now, should we just add training continuation? Someone mentioned it is already implemented, though, so if there is already a fixed idea on this, it would be great to share it.

gedoensmax avatar Oct 25 '19 20:10 gedoensmax