scaper icon indicating copy to clipboard operation
scaper copied to clipboard

ann.sandbox.scaper is sometimes a dictionary, sometimes a jams.Sandbox

Open pseeth opened this issue 6 years ago • 6 comments

#55 makes a change where paths to isolated audio for each event and the path to the generated audio is stored in ann.sandbox.scaper.soundscape_audio_path and ann.sandbox.scaper.isolated_events_audio_path. This led me to discover that ann.sandbox.scaper, which gets passed in via ann to _generate_audio is a dict when generate_from_jams calls _generate_audio, but is a jams.Sandbox when generate calls _generate_audio! My fix for this at the moment is to cast ann.sandbox.scaper to a jams.Sandbox after loading the jams object.

This raises a question - why doesn't JAMS itself load ann.sandbox.scaper as a jams Sandbox, when it is saved as a Sandbox? I tracked this down with the following minimal example using only jams:

jam = jams.JAMS()

ann = jams.Annotation(namespace='scaper')
ann.sandbox = jams.Sandbox(
    sandbox_in_sandbox=jams.Sandbox(test='test'))

jam_file = tempfile.NamedTemporaryFile(suffix='.jams', delete=True)
jam.annotations.append(ann)
print('class before saving', type(jam.annotations[0].sandbox.sandbox_in_sandbox))
jam.save(jam_file.name, strict=False)

jam = jams.load(jam_file.name, strict=False)
print('class after save and load', type(jam.annotations[0].sandbox.sandbox_in_sandbox))

Here's the output:

class before saving <class 'jams.core.Sandbox'>
class after save and load <class 'dict'>

pseeth avatar Feb 19 '20 00:02 pseeth

@bmcfee, would be helpful to get your thoughts, i think!

pseeth avatar Feb 19 '20 00:02 pseeth

I think this is correct (if awkward) behavior. Json doesn't know anything about our class hierarchy, and the only thing that knows to interpret a dict as a sandbox is our parsing logic. Nothing in the schema says that sandboxes occur anywhere outside the specified locations, so they'll stay as dicts when reloaded.

Probably the thing to do here is not store sandboxes in sandboxes.

bmcfee avatar Feb 21 '20 23:02 bmcfee

Okay cool, thanks @bmcfee. We now cast the dictionary to a sandbox in generate_from_jams(). Since _generate_audio is marked as an internal function, we can probably close this.

pseeth avatar Feb 23 '20 18:02 pseeth

@pseeth I'm curious - is there an actual need to cast ann.sandbox.scaper to jams.Sandbox? It doesn't look like it would impact generate_from_jams?

Also, I'm wondering why ann.sandbox.scaper is a jams sandbox in the first place... maybe at the time I thought it made sense for some reason. But thinking about it, is there any reason for this not to simply be a dictionary?

Here's the relevant line: https://github.com/justinsalamon/scaper/blob/583da2b0b7fa3625e035d7d3cbf721da7f60ead2/scaper/core.py#L1593

justinsalamon avatar Feb 25 '20 01:02 justinsalamon

The need cropped up in the source separation stuff, because in _generate_audio we create the actual files and then add them to the sandbox. Since _generate_audio gets called by generate_from_jams as well as generate, there was a bug when one function called it vs the other (because of the syntax of accessing a Sandbox vs accessing keys in a dictionary). I just thought that things going into the function should be of a predictable type. I picked Sandbox instead of dict so that the syntax could match the other bits where the Sandbox is used (e.g. ann.sandbox.scaper.isolated_events_audio_path.)

But yes, I don't know if it needs to be a sandbox at all.

pseeth avatar Feb 25 '20 01:02 pseeth

Since the JAMS API doesn't guarantee any checks/casting for sandboxes outside of the specified locations as noted by @bmcfee, perhaps it would be safer to update the code to stick to dicts to make sure we con't introduce future bugs downstream by assuming that the scaper sandbox in a loaded jams file is a jams.Sandbox when in fact it's just a dict?

justinsalamon avatar Feb 25 '20 01:02 justinsalamon