Add new optional has_data_lines metadata attribute and use it to filter technically empty datasets

Open wm75 opened this issue 7 months ago • 5 comments

The need to filter "technically" empty datasets, such as BAMs that contain only a header and no read records, has come up in discussions more than once.

The new metadata flag proposed here indicates whether a dataset has more than just header lines. If set, the Filter empty datasets collection operation can make use of it, as demonstrated here for SAM datasets.
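To make the idea concrete, here is a minimal, standalone sketch of the check for SAM-like text. It is not the code from this PR and does not use Galaxy's datatype or metadata API: the function names `has_data_lines` and `filter_technically_empty`, and the representation of datasets as a plain dict of name to text, are assumptions made for illustration only.

```python
from typing import Dict, Iterable


def has_data_lines(lines: Iterable[str], header_prefix: str = "@") -> bool:
    """Return True if any non-blank line is not a header line.

    Hypothetical helper: for SAM, header lines start with "@".
    Stops at the first data line, so large files are not read in full.
    """
    for line in lines:
        if line.strip() and not line.startswith(header_prefix):
            return True
    return False


def filter_technically_empty(datasets: Dict[str, str]) -> Dict[str, str]:
    """Keep only datasets that contain at least one data line.

    Hypothetical stand-in for a collection filter; `datasets` maps
    an element name to its text content.
    """
    return {
        name: text
        for name, text in datasets.items()
        if has_data_lines(text.splitlines())
    }


# Example input: one header-only SAM and one SAM with a single read record.
HEADER = "@HD\tVN:1.6\n@SQ\tSN:chr1\tLN:100\n"
RECORD = "r1\t0\tchr1\t1\t60\t4M\t*\t0\t0\tACGT\tIIII\n"

collection = {
    "header_only": HEADER,
    "with_reads": HEADER + RECORD,
}
kept = filter_technically_empty(collection)
```

In this sketch the result of the check would be stored once as the metadata value at dataset-creation time, so the filter itself would only need to read the flag rather than reopen the file.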

Before investing more time in this, I'd like to get opinions on whether this looks like a reasonable way to approach the problem.

I think the filtering functionality might be better added as a new tool rather than as a change to the behavior of the existing Filter empty datasets tool, so feel free to comment on this aspect as well.

License

  • [x] I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

wm75, Jan 14 '24 21:01