ann.sandbox.scaper is sometimes a dictionary, sometimes a jams.Sandbox
#55 makes a change where paths to isolated audio for each event and the path to the generated audio is stored in ann.sandbox.scaper.soundscape_audio_path and ann.sandbox.scaper.isolated_events_audio_path. This led me to discover that ann.sandbox.scaper, which gets passed in via ann to _generate_audio is a dict when generate_from_jams calls _generate_audio, but is a jams.Sandbox when generate calls _generate_audio! My fix for this at the moment is to cast ann.sandbox.scaper to a jams.Sandbox after loading the jams object.
This raises a question - why doesn't JAMS itself load ann.sandbox.scaper as a jams Sandbox, when it is saved as a Sandbox? I tracked this down with the following minimal example using only jams:
jam = jams.JAMS()
ann = jams.Annotation(namespace='scaper')
ann.sandbox = jams.Sandbox(
sandbox_in_sandbox=jams.Sandbox(test='test'))
jam_file = tempfile.NamedTemporaryFile(suffix='.jams', delete=True)
jam.annotations.append(ann)
print('class before saving', type(jam.annotations[0].sandbox.sandbox_in_sandbox))
jam.save(jam_file.name, strict=False)
jam = jams.load(jam_file.name, strict=False)
print('class after save and load', type(jam.annotations[0].sandbox.sandbox_in_sandbox))
Here's the output:
class before saving <class 'jams.core.Sandbox'>
class after save and load <class 'dict'>
@bmcfee, would be helpful to get your thoughts, i think!
I think this is correct (if awkward) behavior. Json doesn't know anything about our class hierarchy, and the only thing that knows to interpret a dict as a sandbox is our parsing logic. Nothing in the schema says that sandboxes occur anywhere outside the specified locations, so they'll stay as dicts when reloaded.
Probably the thing to do here is not store sandboxes in sandboxes.
Okay cool, thanks @bmcfee. We now cast the dictionary to a sandbox in generate_from_jams(). Since _generate_audio is marked as an internal function, we can probably close this.
@pseeth I'm curious - is there an actual need to cast ann.sandbox.scaper to jams.Sandbox? It doesn't look like it would impact generate_from_jams?
Also, I'm wondering why ann.sandbox.scaper is a jams sandbox in the first place... maybe at the time I thought it made sense for some reason. But thinking about it, is there any reason for this not to simply be a dictionary?
Here's the relevant line: https://github.com/justinsalamon/scaper/blob/583da2b0b7fa3625e035d7d3cbf721da7f60ead2/scaper/core.py#L1593
The need cropped up in the source separation stuff, because in _generate_audio we create the actual files and then add them to the sandbox. Since _generate_audio gets called by generate_from_jams as well as generate, there was a bug when one function called it vs the other (because of the syntax of accessing a Sandbox vs accessing keys in a dictionary). I just thought that things going into the function should be of a predictable type. I picked Sandbox instead of dict so that the syntax could match the other bits where the Sandbox is used (e.g. ann.sandbox.scaper.isolated_events_audio_path.)
But yes, I don't know if it needs to be a sandbox at all.
Since the JAMS API doesn't guarantee any checks/casting for sandboxes outside of the specified locations as noted by @bmcfee, perhaps it would be safer to update the code to stick to dicts to make sure we con't introduce future bugs downstream by assuming that the scaper sandbox in a loaded jams file is a jams.Sandbox when in fact it's just a dict?