SigMF add 'capture_tags' field to captures metadata

This was discussed in #124, and I keep finding a need for it: there is currently no way to record why a segment exists unless that information is sufficiently captured via frequency or datetime.

Jul 30 '21 20:07 jacobagilbert

@jacobagilbert - Agreed on the simplicity of the change, but I'm not sure I'm sold on the concept, yet. Can you give me an example of how you use this field, and how it's different from an annotation's description field that is describing a chunk of samples in the same segment?

Aug 25 '21 01:08 bhilburn

The primary reason for my making the PR to add this field is to contain capture-segment-scoped information about the recording. Examples:

Flagging a portion of the data as saturated (/compression/nonlinear)
Indicating a portion of a dataset should be ignored by a post processing stage

This might be better handled by application-specific (extension) fields; if you think that's the case, I can withdraw this.
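For illustration, the two examples above might look like this in the `captures` array. This is only a sketch: `core:capture_tags` is the field proposed in this PR, not part of the published SigMF spec, and the tag strings and sample offsets are made up.

```python
# Hypothetical captures array. "core:capture_tags" is the field proposed in
# this PR; the tag names ("saturated", "ignore") are illustrative only.
captures = [
    {"core:sample_start": 0, "core:frequency": 915e6},
    {"core:sample_start": 100000, "core:frequency": 915e6,
     "core:capture_tags": ["saturated"]},
    {"core:sample_start": 150000, "core:frequency": 915e6,
     "core:capture_tags": ["ignore"]},
    {"core:sample_start": 200000, "core:frequency": 915e6},
]


def tags_of(capture):
    """Return the tags on a captures segment, or an empty list if it has none."""
    return capture.get("core:capture_tags", [])


# Segments that some post-processing stage should skip.
skipped = [c["core:sample_start"] for c in captures if "ignore" in tags_of(c)]
```

Note that frequency is identical across all four segments here, so without the tags nothing in the metadata would explain why the segment boundaries exist.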

Aug 25 '21 13:08 jacobagilbert

So I am changing my mind on this slightly. I actually think what is needed is a short form field for machine parsing, be that: label / category / type / capture_type etc.

Not sure where your current thoughts are on this, @bhilburn...

Specifically I am writing a presentation SigMF Extension (will PR it here) for use with applications such as inspectrum, and one thing you might want to do is define different presentations for different captures segments. A short form field is necessary for this, and it is not limited to presentation layer information. E.g.: one may want to sample data from a file but avoid any areas marked invalid such as around when a radio is retuning.
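A rough sketch of the "avoid areas marked invalid" case, assuming a hypothetical `core:capture_tags` list on each captures segment and an `"invalid"` tag value (neither is defined by the spec today). It relies only on the existing rule that a segment runs from its `core:sample_start` to the start of the next segment.

```python
def valid_ranges(captures, total_samples, skip_tags=("invalid",)):
    """Yield (start, stop) sample ranges for segments carrying no skipped tag.

    `stop` is exclusive. `total_samples` is the length of the recording, which
    the caller is assumed to know (e.g. from the data file size).
    """
    ordered = sorted(captures, key=lambda c: c["core:sample_start"])
    for i, cap in enumerate(ordered):
        start = cap["core:sample_start"]
        # A segment ends where the next one starts, or at the end of the file.
        if i + 1 < len(ordered):
            stop = ordered[i + 1]["core:sample_start"]
        else:
            stop = total_samples
        tags = cap.get("core:capture_tags", [])  # hypothetical field
        if not set(tags) & set(skip_tags):
            yield (start, stop)


captures = [
    {"core:sample_start": 0, "core:frequency": 100e6},
    {"core:sample_start": 5000, "core:capture_tags": ["invalid"]},  # retuning
    {"core:sample_start": 5200, "core:frequency": 200e6},
]
ranges = list(valid_ranges(captures, total_samples=10000))
```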

Oct 25 '21 17:10 jacobagilbert

@bhilburn @Teque5 dusting this off because I think it's useful. The updates here are basically the "most complicated" way of doing this. I think I could get 99% of the utility I need out of this with a single string for the capture_tag.

Same motivation as before: to help automated processing (specifically Monte Carlo sampling from datasets) work as effectively as possible, and to let users flag problems in datasets without modifying the data files or resorting to clunky things like removing parts of files.
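As a sketch of the Monte Carlo use case: given a list of allowed `(start, stop)` sample ranges (for example, the segments that carry no problem tag), one could draw fixed-length windows uniformly over every position where a full window fits. The function name, the ranges, and the window size below are all assumptions for illustration.

```python
import random


def sample_windows(ranges, window, n, seed=None):
    """Draw n window start indices, uniform over all legal start positions.

    `ranges` is a list of (start, stop) pairs with exclusive stop. A window
    never crosses a range boundary, so excluded regions are never read.
    """
    rng = random.Random(seed)
    # Number of legal start positions per range (0 if the window does not fit).
    slots = [max(0, stop - start - window + 1) for start, stop in ranges]
    if sum(slots) == 0:
        raise ValueError("window does not fit in any allowed range")
    # Pick a range with probability proportional to its legal positions,
    # then pick a position uniformly within it.
    picks = rng.choices(range(len(ranges)), weights=slots, k=n)
    return [ranges[i][0] + rng.randrange(slots[i]) for i in picks]


allowed = [(0, 5000), (5200, 10000)]  # made-up ranges around a bad region
starts = sample_windows(allowed, window=1024, n=100, seed=1)
```

The data file itself is untouched; only the metadata decides which regions are eligible.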

Let me know what you think.

Mar 18 '22 14:03 jacobagilbert

I like the idea of adding tags. A few thoughts:

Are we defining a new thing called segment, i.e. a portion of a capture, for the first time? I think this is okay, but would it make more sense to call it capture_tags?
You have (2) extra brackets in your diff here and here.
To play devil's advocate, couldn't we achieve a similar result by simply adding an annotation that covers samples X-Y? If we add this tag field, people might use it to implement non-capture-related tags like preamble or crc.
If something changes in the environment, does that go in a tag or in the annotation? E.g. (inside/outside), (day/night). When exactly do you use which?

Mar 18 '22 17:03 Teque5

So I believe a "captures segment" has been a thing all along but maybe I'm using this more liberally than I should... capture_tags works too. I actually had that at first but switched for some reason. I'll change this.
Will fix.
I suppose this could be achieved in an annotation also. The thought was that this is scoped at a bit higher level (all samples in this capture have these associated tags), as opposed to annotations, which do have an order (start sample) but no hierarchy.
I think adding a new capture segment with tags would mostly be done in postprocessing, though things like saturation could be detected automatically.

Thanks for looking at this.

Mar 18 '22 17:03 jacobagilbert

@Teque5 updated per your feedback, and I explicitly included some info on the overlap with annotations.

I am still unsure if this should be a single tag (much simpler, probably will handle most cases) or a list... Open to input here.
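Whichever form is chosen, a reader could treat both the same way. A minimal sketch, assuming the value is either a single string or a list of strings (the helper name is hypothetical):

```python
def normalize_tags(value):
    """Return capture tags as a list, accepting a single string or a list."""
    if value is None:
        return []
    if isinstance(value, str):
        # Single-tag form: wrap it so callers always iterate over a list.
        return [value]
    return list(value)


single = normalize_tags("saturated")
several = normalize_tags(["saturated", "ignore"])
missing = normalize_tags(None)
```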

Mar 19 '22 21:03 jacobagilbert

Another use case I came across recently was annotating a Bluetooth recording in which the acquisition and sync phase between two BT devices is explicitly identified, followed by a segment identifying the active data transfer phase.
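A sketch of how that recording might be described and queried, assuming the proposed `core:capture_tags` field; the tag names and sample offsets are invented for illustration. The lookup relies on captures segments being sorted by `core:sample_start`, which the spec already requires.

```python
from bisect import bisect_right

# Hypothetical metadata for the Bluetooth recording described above.
captures = [
    {"core:sample_start": 0, "core:capture_tags": ["acquisition_sync"]},
    {"core:sample_start": 48000, "core:capture_tags": ["data_transfer"]},
]


def tags_at(captures, sample_index):
    """Return the tags of the captures segment containing sample_index."""
    starts = [c["core:sample_start"] for c in captures]  # sorted ascending
    i = bisect_right(starts, sample_index) - 1
    if i < 0:
        return []  # index precedes the first segment
    return captures[i].get("core:capture_tags", [])


phase = tags_at(captures, 50000)
```

Because every sample belongs to exactly one captures segment, the lookup always yields a single phase, which is harder to guarantee with overlapping annotations.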

Mar 21 '22 16:03 jacobagilbert