
Support for DNAnexus applets

Open gaow opened this issue 3 years ago • 3 comments

@BoPeng we have recently been looking into UKB data for some traits we analyze,

https://dnanexus.gitbook.io/uk-biobank-rap/science-corner/guide-to-analyzing-large-sample-sets

the new UKB release is only available through this platform (!!). As you can see, the system is based on a DNAnexus implementation of WDL, with jobs managed through DNAnexus applets. Our pipelines are written in SoS, which distributes jobs using PBS templates etc. It is not obvious that SoS can run with DNAnexus applets, and DNAnexus is perhaps the most popular (if not the only) cloud platform for WDL. What's your take on this, or do you have suggestions for SoS users in this setup?

gaow, Nov 16 '21 21:11

> the new UKB release is only available through this platform

Curious as to why this is the case, since we are interested in running UKB data on DNAnexus as well.

BoPeng, Nov 16 '21 22:11

There are two possibilities. The first is to run SoS scripts entirely on the platform by specifying Python, SoS, etc. as dependencies. In that case we will not be able to use our workflow features to process multiple files on multiple nodes. The second is more in line with the spirit of SoS, namely wrapping scripts to be executed on DNAnexus, with the files already on there. The applet would have to be compiled and uploaded, but that is not hugely different from how we handle building Docker images and using them to process input files. This should work reasonably well for simple commands (e.g. bash scripts).

So this looks essentially like a sos-dnanexus module that works as a task engine (easier) and a workflow engine (more difficult), and that calls the dx command to do a lot of things.
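To make the task-level idea concrete, here is a minimal sketch of how such a module might assemble a `dx run` call for an applet that has already been built and uploaded. The function name `build_dx_run_command`, the applet name, the input name `script`, and the paths are all hypothetical; only the `dx run` options shown (`-i`, `--destination`, `--instance-type`, `--yes`, `--brief`) are assumed from the dx command-line client. Nothing here is existing SoS functionality.

```python
import shlex
from typing import Dict, List, Optional


def build_dx_run_command(
    applet: str,
    inputs: Dict[str, str],
    destination: Optional[str] = None,
    instance_type: Optional[str] = None,
) -> List[str]:
    """Assemble a `dx run` invocation as an argument list.

    Hypothetical helper: it only builds the command and does not execute it.
    """
    cmd = ["dx", "run", applet]
    # Each applet input is passed as a separate -i name=value pair.
    for name, value in inputs.items():
        cmd += ["-i", f"{name}={value}"]
    if destination is not None:
        # Project folder where the job's outputs should be written.
        cmd += ["--destination", destination]
    if instance_type is not None:
        cmd += ["--instance-type", instance_type]
    # --yes skips the interactive confirmation; --brief limits the output
    # so that a caller can capture the job ID.
    cmd += ["--yes", "--brief"]
    return cmd


if __name__ == "__main__":
    # Hypothetical applet, input name, and paths, for illustration only.
    cmd = build_dx_run_command(
        "my_sos_task_applet",
        {"script": "/scripts/step1.sh"},
        destination="/results",
    )
    print(shlex.join(cmd))
```

A task engine along these lines would pass the list to `subprocess.run` and record the returned job ID; the workflow-engine part (tracking dependencies between such jobs) is the harder piece and is not covered by this sketch.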

BoPeng, Jan 29 '22 17:01

Also, building and uploading Docker images could be a more general solution.

https://youtu.be/A_iki_50Ig0

BoPeng, Jan 29 '22 17:01