tools-devteam
Create a tool that will Group and Filter on multiple columns
Example usage: https://biostar.usegalaxy.org/p/14780/
Group: Use the same logic as the Group function, meaning one output row per distinct key.
Filter: Use the same logic as "Filter by column", but permit multiple field filters (+ Add filter).
Complexity: If there are ties, ideally provide some way either to break them or to flag whether a result is actually unique or is part of a tie. For example, an additional column at the end of the output could note whether each row is unique, so that ties can be selected out, examined, and have additional filters applied. There could also be an option to report "all" or report "ties". (If no ties are to be reported, arbitrarily pick the first occurrence, for instance.)
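To make the request concrete, here is a rough Python sketch of the behavior described above. It is only an illustration, not a proposed implementation: the function names, the `(column, op, value)` filter format, the `report` modes, and the sample rows are all hypothetical. It also assumes that filters compare numeric columns, and that a "tie" means more than one row in a group survives every filter.

```python
import operator
from collections import OrderedDict

# Hypothetical set of comparison operators a filter may use.
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def passes(row, filters):
    """Return True if the row satisfies every (column, op, value) filter.

    Assumption: filtered columns hold numeric values.
    """
    return all(OPS[op](float(row[col]), value) for col, op, value in filters)


def group_and_filter(rows, key_cols, filters, report="first"):
    """Filter rows, group survivors by key, and append a unique/tie flag.

    report="first": one row per key (first occurrence).
    report="all":   every surviving row.
    report="ties":  only rows that belong to a tied group.
    """
    groups = OrderedDict()  # keeps keys in order of first appearance
    for row in rows:
        if passes(row, filters):
            key = tuple(row[c] for c in key_cols)
            groups.setdefault(key, []).append(row)

    out = []
    for members in groups.values():
        status = "unique" if len(members) == 1 else "tie"
        if report == "ties" and status == "unique":
            continue
        kept = members[:1] if report == "first" else members
        out.extend(r + [status] for r in kept)
    return out


# Made-up sample data: name, chromosome, score, length.
rows = [
    ["geneA", "chr1", "0.9", "120"],
    ["geneA", "chr1", "0.95", "80"],
    ["geneA", "chr1", "0.97", "150"],
    ["geneB", "chr2", "0.99", "200"],
    ["geneC", "chr3", "0.5", "300"],
]
# Two filters applied together: score >= 0.9 AND length >= 100.
filters = [(2, ">=", 0.9), (3, ">=", 100)]

for line in group_and_filter(rows, key_cols=[0], filters=filters):
    print("\t".join(line))
```

With `report="first"` this gives one row per key plus the flag column, so tied groups can be pulled out afterwards and examined with additional filters.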
The usage seems unique enough to warrant a new tool. Iterative filtering/grouping cannot achieve quite the same results, or I cannot figure out how.
This could be an enhancement to the Filter tool itself - so no new tool, instead expand the functionality of the existing tool.
Thoughts? There could be other ways to implement this and still achieve the same functionality.
@jennaj have you tried the datamash tools for this kind of analysis?
As we chatted about, I haven't yet, but look forward to testing out the IUC version once available. If it can address this challenge, then we can close this ticket out and make a new issue to request that it be reviewed for inclusion at http://usegalaxy.org.
(IMHO, the current Tool Shed version has many functions that complement the existing data manipulation tools on Main, so the IUC version will probably be the same or maybe even a bit better!)
As soon as we get https://github.com/galaxyproject/tools-iuc/pull/353 merged I will upload it to the TS.
Great, ping me here or by email when it is ready, if you have time. I can also watch the TS(s). Thanks!
Ping! I updated the MTS.
Super, will be checking it out, thanks!