
secondaryFiles are not discovered?

Open buchanae opened this issue 7 years ago • 7 comments

I'm testing bunny with a TES service.

bunny + TES is failing a conformance test (#83) which includes secondaryFiles.

bunny without TES is passing that test, but curiously, when I carefully inspect the bunny logs, all of the logged data structures show secondaryFiles=[]. I suspect that the test is passing because it shares the same local filesystem.

I'm testing against this branch of bunny, which could also be a factor.

Are you aware of this behavior?

buchanae avatar Mar 15 '17 23:03 buchanae

Here is the line where it's checked that secondary files actually exist at a given location. Ideally it should be invoked after a tool has finished its job and the files are present locally to the executor. It's possible that there is a bug in the way bunny invokes TES, especially since it was originally only coded to a "proof of concept" stage. We are wrapping up work on bunny on SBG integration in the next week or two, and we will shift our focus to TES once again.

StarvingMarvin avatar Mar 16 '17 14:03 StarvingMarvin

We have confirmed that this issue is not specific to the TES backend.

CWLInputPort doesn't contain any references to secondary files at all. I think the getSecondaryFiles method needs to be called somewhere in this region.
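For anyone following along, here is a minimal sketch of what discovery for an input file involves, assuming only the plain pattern-string form of secondaryFiles from CWL v1.0 (each leading `^` strips one extension from the primary path, then the rest of the pattern is appended). This is not bunny's code: `resolve_secondary_path` and `discover_secondary_files` are hypothetical names used for illustration, and expression-based secondaryFiles are not handled.

```python
import os


def resolve_secondary_path(primary_path, pattern):
    """Apply a CWL v1.0 secondaryFiles pattern string to a primary file path."""
    path = primary_path
    # Each leading caret removes the last extension from the primary path.
    # Assumption: os.path.splitext is close enough to the spec's
    # "last period and all following characters" rule for ordinary file names.
    while pattern.startswith("^"):
        path = os.path.splitext(path)[0]
        pattern = pattern[1:]
    # The remainder of the pattern is appended to the (possibly shortened) path.
    return path + pattern


def discover_secondary_files(primary_path, patterns):
    """Split the resolved secondary file paths into those found and those missing."""
    found, missing = [], []
    for pattern in patterns:
        candidate = resolve_secondary_path(primary_path, pattern)
        if os.path.exists(candidate):
            found.append(candidate)
        else:
            missing.append(candidate)
    return found, missing
```

With an input like `/data/ref.fa` and patterns `[".fai", "^.dict"]`, the candidates are `/data/ref.fa.fai` and `/data/ref.dict`. A backend that does not share the local filesystem would need every entry in `found` staged alongside the primary file, which would explain why the test only passes locally.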

adamstruck avatar Mar 20 '17 23:03 adamstruck

To be clear, we think this is specific to loading an app's inputs, rather than evaluating files in between jobs.

buchanae avatar Mar 21 '17 00:03 buchanae

Is there an ETA on when this will be fixed? The TES backend isn't very functional without this. Any workflow that has inputs with required secondary files will fail.

adamstruck avatar Jul 19 '17 15:07 adamstruck

Hi, sorry for the very late response. We had some team changes and some issues on the platform that we had to address first.

Basic staging of secondary input files should be available in the latest release of bunny, but I'm not sure that it will work the same on TES. Anyway, there is a somewhat older, separate branch where it should work: "bugfix/tes", which deals with TES in a different way, without the rabix docker image but using direct conversion and file access. Last I checked, it was passing most CWL 1.0 conformance tests when run on funnel, with maybe 10-12 failing.

In the following days, if I get the time, I will update the branch with the latest fixes and I'll contact you with the new binary and configs.

milos-ljubinkovic avatar Oct 18 '17 18:10 milos-ljubinkovic

@milos-ljubinkovic that would be great, thank you!

adamstruck avatar Oct 24 '17 17:10 adamstruck

> We have confirmed that this issue is not specific to the TES backend.

Just here to +1 this issue. We've been exploring rabix bunny among other implementations that support parallelism, but our workflows use secondaryFiles for things like index files on a reference genome:

https://github.com/Duke-GCB/bespin-cwl/blob/52457d925f9c232394c38d6ccb890cced9f93e2c/workflows/exomeseq.cwl#L21-L30

so this issue stops our workflow pretty quickly.

dleehr avatar Dec 19 '17 15:12 dleehr