File Storage Observer: know where files are dumped during a run

Open SumNeuron opened this issue 5 years ago • 7 comments

So I have read the Observing an experiment documentation up and down and the following isn't clear to me:

If I want to run an experiment and keep track of just some basic information (e.g. the config and the acc) using the File Observer, how can I know where (what subdir) it is writing to?

I set the directory where I want all the experiments to be written to, but once I run an experiment, I do not know what the subdir will be (it seems to be just enumerated...)

ex.observers.append(FileStorageObserver.create(experiment_dir))

Why would I want to know this?

Well, returning the weights would result in them being dumped into the result entry of the run file. This is not efficient and a poor idea.

The documentation for custom information suggests saving large data separately and using add_artifact to save it to the DB.

I do not need two copies of the file.

So it would be nice if I could, via _run, access "observer_dump_loc" so I can write my files there:

os.path.join(_run["observer_dump_loc"], "all_this_data.npz")

# writes to /experiment_dir/<observer_dump_loc>/all_this_data.npz

SumNeuron commented Mar 01 '19 13:03

Looking at my logger, it seems that the run has an "ID" property

INFO: :Started run with ID "1"

but looking at run.json

{
  "artifacts": [],
  "command": "main",
  "experiment": {
  ...
  },
  ...
}

"ID" is no where to be found

SumNeuron commented Mar 01 '19 13:03

The documentation for custom information suggests saving large data separately and using add_artifact to save it to the DB. I do not need two copies of the file.

That's a downside of the artifact interface, and we are working on creating artifacts from in-memory objects; for now, however, that is just how it works. Just write the file to your local directory and call add_artifact on that file. You can then delete the file from your local directory.
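For illustration, a minimal sketch of that write-then-delete pattern (the experiment name, directory, and payload here are placeholders):

import os

import numpy as np

from sacred import Experiment
from sacred.observers import FileStorageObserver

ex = Experiment("my_experiment")
ex.observers.append(FileStorageObserver.create("experiment_dir"))


@ex.automain
def main(_run):
    weights = np.zeros((10, 10))    # stand-in for real model weights
    tmp_path = "all_this_data.npz"  # written to the current working directory
    np.savez(tmp_path, weights=weights)
    _run.add_artifact(tmp_path)     # the observer stores its own copy
    os.remove(tmp_path)             # so no second copy is left behind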

Looking at my logger, it seems that the run has an "ID" property

Yes indeed. Every run gets an incremented id. This also determines the naming scheme of directories for the FileStorageObserver, I guess: run 1 will be stored in directory 1.
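Under that scheme (an assumption based on the default enumeration; experiment_dir and _run as in the snippets above), the run's directory could be reconstructed as:

import os

run_dir = os.path.join(experiment_dir, str(_run._id))  # e.g. experiment_dir/1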

JarnoRFB commented Mar 01 '19 13:03

From the source code:

# sacred/run.py, lines 25-26
self._id = None
"""The ID of this run as assigned by the first observer"""

is it safe to use _run._id then?

e.g.

import os

ex.observers.append(FileStorageObserver.create(experiment_dir))


@ex.automain
def main(..., _run):
    # _run._id may not be a string, so cast it before building the path
    dump_data_here = os.path.join(experiment_dir, str(_run._id), "all_this_data.npz")


Using the FileStorageObserver (as I am testing locally right now), this is just an enumeration: 1, 2, 3, 4, ... but perhaps _id changes over the course of the experiment (e.g. if parallelized)?

SumNeuron commented Mar 01 '19 13:03

No, _id is the unique identifier for a run and should never change. However, I really recommend using _run.add_artifact instead of manually dumping data. This way you can easily switch to another observer without changing your code.
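For example, only the observer attached at setup time changes; the add_artifact call inside the run function stays identical (the connection parameters below are placeholders):

from sacred import Experiment
from sacred.observers import FileStorageObserver, MongoObserver

ex = Experiment("my_experiment")

# Swap observers freely; _run.add_artifact in main() needs no changes.
ex.observers.append(FileStorageObserver.create("experiment_dir"))
# ex.observers.append(MongoObserver.create(url="localhost:27017", db_name="sacred"))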

JarnoRFB commented Mar 01 '19 13:03

With MongoDB I agree.

For FileObserver less so.

Why?

To my current understanding of Sacred, the add_artifact method requires a full path for the filename argument.

This means that the purpose of Sacred (managing experiments) is greatly negated, as I at the very least have to make an overwritable tmp dir somewhere to dump my files to. Moreover, if I am running several experiments at once, I need to make several of these or manage naming etc. (which is what Sacred already does for runs).

So if I am going to code all that, then why use Sacred? :P (I know there are still benefits; I am just trying to emphasize a point.)

Perhaps a solution could be that each run creates a temp dump dir that is deleted upon completion.

There I can write out my files and then call add_artifact to tell Sacred which files to keep in the DB.

e.g.

@ex.automain
def main(..., _run):
    # ... run stuff

    # _run.dump_dir would be a tmp dir, deleted either when main returns
    # or after all add_artifact events have finished
    out_file = os.path.join(_run.dump_dir, 'my_big_file.pckl')
    # save to out_file
    _run.add_artifact(out_file)

    # continue
SumNeuron commented Mar 01 '19 13:03

I totally see your point here. A temporary directory per run sounds like a workable solution; I take it as a feature request! Maybe mkdtemp already does it. Meanwhile, you can prefix all artifact files with _run._id and delete them right after you have added them. This should keep the naming management to a minimum.
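Until such a feature exists, a per-run dump dir can be approximated with the standard library (a sketch, not a built-in Sacred feature; the file name and payload are placeholders, and ex is the Experiment from the snippets above):

import os
import tempfile

import numpy as np


@ex.automain
def main(_run):
    # Emulate a per-run dump dir: created here, removed when the block exits.
    with tempfile.TemporaryDirectory(prefix="run_{}_".format(_run._id)) as dump_dir:
        out_file = os.path.join(dump_dir, "my_big_file.npz")
        np.savez(out_file, weights=np.zeros((10, 10)))  # placeholder payload
        _run.add_artifact(out_file)  # the observer keeps its own copy
    # dump_dir and its contents are gone here; only the observer's copy remains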

JarnoRFB commented Mar 01 '19 14:03

Hi. What prevents us from allowing ndarrays and streams as add_artifact arguments? That would solve the naming issue, since we would not have any names, and having to dump weights etc. into a file first is not a very convenient protocol. If some underlying package only accepts filenames (Mongo? I am only guessing), then perhaps Sacred could bridge the gap and create a temp file transparently to the user. Also, a user of Sacred may sometimes genuinely need a temporary file, when some other package they use only provides a filename interface; so even with add_artifact supporting buffers, it would still be handy to have a convenience function that creates a named temporary file associated with the current run and returns the name.
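As an interim workaround, a helper along these lines (hypothetical, not part of Sacred's API) could bridge an in-memory bytes buffer to the filename-based interface:

import os
import tempfile


def add_artifact_from_bytes(run, data, name):
    """Hypothetical helper: spool an in-memory bytes buffer through a
    temp file so it fits the filename-based add_artifact interface."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        run.add_artifact(path, name=name)  # the observer stores it under `name`
    finally:
        os.remove(path)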

SomeoneSerge commented Apr 21 '19 09:04