crowsetta
ENH: consider adding support for GUANO audio metadata parsing/writing
Describe the solution you'd like
Guano is a metadata convention used by the bat acoustics community.
See the spec: https://github.com/riggsd/guano-spec/blob/master/guano_specification.md
and the Python package: https://pypi.org/project/guano/
It has a nice set of values in the defaults, and allows "namespaces" with custom sets of fields as well.
Implementation-wise, it writes the metadata to a separate section of the WAV file (only WAV is supported), similar to the default header, but that section can be at the end of the file for some reason. However, it seems it also supports I/O of text files and other formats.
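A rough sketch of what reading that section could look like, using only the standard library. This is not the `guano` package's API. It assumes, from my reading of the spec, that the metadata lives in a RIFF subchunk with the ID `guan` and is UTF-8 text with one `Key: Value` (or `Namespace|Key: Value`) field per line; the function names here are hypothetical.

```python
import struct


def iter_riff_chunks(data: bytes):
    """Yield (chunk_id, payload) for each top-level chunk of a RIFF/WAVE byte string."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12  # skip 'RIFF', the 4-byte size, and 'WAVE'
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        yield chunk_id, data[pos + 8 : pos + 8 + size]
        pos += 8 + size + (size & 1)  # chunks are padded to an even length


def parse_guano_text(text: str) -> dict:
    """Parse 'Namespace|Key: Value' lines into {namespace: {key: value}}.

    Fields without a namespace are stored under the empty string.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if not sep or not key.strip():
            continue  # skip blank or malformed lines
        key = key.strip()
        if "|" in key:
            namespace, name = key.split("|", 1)
        else:
            namespace, name = "", key
        fields.setdefault(namespace, {})[name] = value.strip()
    return fields


def read_guano(data: bytes) -> dict:
    """Return parsed metadata from the first 'guan' chunk (assumed chunk ID), or {}."""
    for chunk_id, payload in iter_riff_chunks(data):
        if chunk_id == b"guan":
            return parse_guano_text(payload.decode("utf-8"))
    return {}
```

Because the reader walks every chunk instead of assuming a fixed offset, it does not matter whether the metadata chunk comes before or after the audio data.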
Thank you @sammlapp for suggesting this.
I have looked at GUANO before but hadn't added it.
Are you seeing a lot of usage of this format in the wild?
We are biased right now towards formats for annotating speech-like sequences of sounds like birdsong syllables; it seems like GUANO is at the other extreme where the goal is to provide as much relevant metadata as possible about a detection, using the term as it is used in bioacoustics. (I know you know this, just adding context for anyone else who stumbles on the issue.)
It's not clear to me from the spec: is there a way to represent multiple detections within a single file?
Don't mean to grill you, I really appreciate your suggesting this -- I'm just hoping since you're an actual bioacoustician you might have more insight into how this format is being used in the wild
Some related discussion here: https://github.com/tdwg/ac/issues/264 and https://github.com/tdwg/ac/issues/247 (related in the sense that it provides context about how standards groups are thinking about GUANO)
Also ... are you aware of any publicly available datasets that use this format?
I think I looked before and couldn't find any, another reason I didn't raise an issue about it.
Just asking since it would help to test that we can actually parse / write the format.
🤔 this at least says it was collected with Anabat (and Audiomoth): https://databank.illinois.edu/datasets/IDB-4200947
To be honest, I wouldn't prioritize this as I don't see it being used a lot and haven't had a reason to need it. I opened this request because there was already an open feature request about Guano on OpenSoundscape, but it seems like if integration is implemented anywhere, it should be in Crowsetta rather than OpenSoundscape.
Got it, thank you @sammlapp. Happy to add it if you start seeing more of a need for it; it looks fairly painless, judging from the Python implementation.