
cromwellCalls and tidyr-inspired results

Open vortexing opened this issue 5 years ago • 0 comments

The more I work with our Cromwell server and this package, the more I wish it would give me tidyr-inspired results, the way cromwellQuery() does for the basics.

Having cromwellOutputs() return a long-form data frame of workflow IDs and output paths, instead of a nested list, would be more expected behavior to me. With the output of cromwellQuery() I'm ready to hit the ground running on all my other processes here, which is convenient because I can focus my development efforts elsewhere.
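To make the request concrete, here is a minimal sketch of the flattening I have in mind. It assumes the nested result is a named list keyed by workflow ID, where each element is a named list of output names mapping to one or more paths. That shape, the helper name `outputs_long()`, and all the IDs and paths below are hypothetical, not what the package currently returns. It uses only base R so it runs without extra packages.

```r
# Hypothetical nested result, shaped the way parsed outputs JSON might look.
# Workflow IDs, output names, and paths are made up for illustration.
nested_outputs <- list(
  "wf-001" = list(
    "align.bam"  = "/scratch/wf-001/sample.bam",
    "align.logs" = c("/scratch/wf-001/shard-0.log",
                     "/scratch/wf-001/shard-1.log")
  ),
  "wf-002" = list(
    "align.bam" = "/scratch/wf-002/sample.bam"
  )
)

# Hypothetical helper: flatten to one row per (workflow, output, path).
outputs_long <- function(nested) {
  rows <- lapply(names(nested), function(wf_id) {
    outs <- nested[[wf_id]]
    per_output <- lapply(names(outs), function(out_name) {
      # An output may hold several paths (e.g. one per shard).
      paths <- as.character(unlist(outs[[out_name]]))
      data.frame(
        workflow_id = rep(wf_id, length(paths)),
        output_name = rep(out_name, length(paths)),
        path = paths,
        stringsAsFactors = FALSE
      )
    })
    do.call(rbind, per_output)
  })
  result <- do.call(rbind, rows)
  rownames(result) <- NULL
  result
}

long <- outputs_long(nested_outputs)
print(long)
```

A result in this shape can go straight into dplyr verbs such as filter() or group_by() without any further melting.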

Another VERY useful thing would be a similar function that returns metadata about the calls made in a workflow. I'd love to be able to directly request a long-form data frame (a la cromwellCalls(id, ..)) that would give me the callName, executionStatus, stdout, backendStatus, jobId, shardIndex, start, end, and returnCode. Or something to that effect.
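A rough sketch of what such a function could do, working on already-parsed workflow metadata rather than calling the server. It assumes the metadata has an `id` and a `calls` element keyed by call name, with each call holding a list of shard/attempt records; the helper name `calls_long()` and every value in the sample data are hypothetical. Fields missing from a record come back as NA, and all columns are kept as character for simplicity.

```r
# Hypothetical parsed metadata for one workflow; all values are illustrative.
metadata <- list(
  id = "wf-001",
  calls = list(
    "align.bwa" = list(
      list(executionStatus = "Done", backendStatus = "Done",
           stdout = "/scratch/wf-001/bwa/shard-0/stdout",
           jobId = "1001", shardIndex = 0L, returnCode = 0L,
           start = "2019-02-06T20:00:00Z", end = "2019-02-06T20:05:00Z"),
      # A still-running shard: no end time or return code yet.
      list(executionStatus = "Running", backendStatus = "Running",
           stdout = "/scratch/wf-001/bwa/shard-1/stdout",
           jobId = "1002", shardIndex = 1L,
           start = "2019-02-06T20:00:00Z")
    ),
    "align.sort" = list(
      list(executionStatus = "Done", backendStatus = "Done",
           stdout = "/scratch/wf-001/sort/stdout",
           jobId = "1003", shardIndex = -1L, returnCode = 0L,
           start = "2019-02-06T20:06:00Z", end = "2019-02-06T20:07:00Z")
    )
  )
)

call_fields <- c("executionStatus", "stdout", "backendStatus", "jobId",
                 "shardIndex", "start", "end", "returnCode")

# Hypothetical helper: one row per call record, one column per field.
calls_long <- function(metadata, fields = call_fields) {
  rows <- list()
  for (call_name in names(metadata$calls)) {
    for (record in metadata$calls[[call_name]]) {
      # Pull each requested field, substituting NA when it is absent.
      values <- lapply(fields, function(f) {
        v <- record[[f]]
        if (is.null(v)) NA_character_ else as.character(v)
      })
      names(values) <- fields
      rows[[length(rows) + 1]] <- data.frame(
        workflow_id = metadata$id,
        callName = call_name,
        values,
        stringsAsFactors = FALSE
      )
    }
  }
  result <- do.call(rbind, rows)
  rownames(result) <- NULL
  result
}

calls <- calls_long(metadata)
print(calls)
```

Keeping every column as character avoids type clashes when a field is missing from some records; converting start and end to date-times could be a later step.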

I think my main issue with interacting with Cromwell via R is the old JSON-to-data-frame conflict. The main reason I'd like to use this R package is to get all the information out of the JSON as soon as possible and into long-form data tables, so native R users can use dplyr to interact with what is happening on Cromwell. Thoughts? Perhaps there's a bigger picture here: adding options to the existing functions so that if you want long-form, melted/unnested data, even if it's SUPER long, you can get it without having to melt/unnest it yourself.

vortexing, Feb 06 '19 20:02