Bring BioCompute exports under invocation export framework

Open jmchilton opened this issue 1 year ago • 2 comments

  • Builds on RO Crate Export
  • Task-based and a similar API should make it easier to integrate into #14606
  • Adds previously missing validation, using the data model generation approach from the DRS work (#13949)
  • Much more typing throughout.
  • Formal annotation of request parameters and such in OpenAPI.

WIP: needs some tests - but everything is strongly typed so the pieces should largely fit together very well.

License

  • [x] I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

jmchilton avatar Sep 14 '22 20:09 jmchilton

Oh, something big is coming together here :)

bgruening avatar Sep 14 '22 21:09 bgruening

Ping @HadleyKing

nsoranzo avatar Sep 14 '22 23:09 nsoranzo

Looks like some code from the populator went missing here... :thinking:

bco = self.workflow_populator.get_biocompute_object(invocation_id)
E           AttributeError: 'WorkflowPopulator' object has no attribute 'get_biocompute_object'

davelopez avatar Sep 30 '22 09:09 davelopez

@jmchilton @davelopez any chance we could take this over and hack on it during the European Galaxy Days CoFest? Do we need anything beyond adding tests?

mvdbeek avatar Oct 05 '22 09:10 mvdbeek

Yes, please take it over, that would be great! It is hard to say what more is needed without tests, but I remember this being pretty close. I don't possess any secret knowledge about BCOs or an agenda related to them; I was just working from the existing branch and the schema, so I think anyone could do this.

I might look at test_workflow_tasks as a place to put a couple tests (https://github.com/galaxyproject/galaxy/pull/14595/files#diff-9f3ad661ca1701467f48508ea79951efe90aa29a6d02dec3f2998d831e755bb4). Maybe expand the number of RO Crate tests while in there also 👼?

jmchilton avatar Oct 05 '22 09:10 jmchilton

I think the BCO part looks pretty solid now, thank you so much @jmchilton! We will add some more tests for the RO-crate part in the upcoming days along with some updates of the format during the BioHackathon :+1:

davelopez avatar Oct 07 '22 09:10 davelopez

@HadleyKing FYI, in this PR we are changing the recommended method to export a Galaxy workflow invocation as a BioCompute Object, see https://github.com/galaxyproject/galaxy/pull/14620/files#diff-52428a6a9a7185836d5f77226f948a385e542250c98b97059fb218b4cdec875a (you'll need to click on "Load diff" to expand the changes). The plan is to deprecate the 2 old API endpoints in the Galaxy release (23.1) and remove them in the following release. Does this sound OK from your side?

nsoranzo avatar Oct 12 '22 17:10 nsoranzo

> @HadleyKing FYI, in this PR we are changing the recommended method to export a Galaxy workflow invocation as a BioCompute Object, see https://github.com/galaxyproject/galaxy/pull/14620/files#diff-52428a6a9a7185836d5f77226f948a385e542250c98b97059fb218b4cdec875a (you'll need to click on "Load diff" to expand the changes). The plan is to deprecate the 2 old API endpoints in the Galaxy release (23.1) and remove them in the following release. Does this sound OK from your side?

Forgive me for asking (there has been a lot of activity here and I have not been able to follow very closely YET) but are there REPLACEMENT APIs for those? There MAY be a use case in the upcoming year for one of our projects having programmatic access to the BioCompute export. Or, as one suggestion, you could deprecate them and then I could do a PR later on for a more up-to-date API if we need it?

AFAIK that was a 1st step before I started working on the UI. My only REAL concern is that we still have the ability to download.

HadleyKing avatar Oct 14 '22 13:10 HadleyKing

The failing tests are unrelated :)

davelopez avatar Oct 14 '22 13:10 davelopez

> Forgive me for asking (there has been a lot of activity here and I have not been able to follow very closely YET) but are there REPLACEMENT APIs for those? There MAY be a use case in the upcoming year for one of our projects having programmatic access to the BioCompute export. Or, as one suggestion, you could deprecate them and then I could do a PR later on for a more up-to-date API if we need it?

Yes, there are replacement API routes. It just uses a generic endpoint for exporting workflow invocations but you can pass a parameter to specify the export format, in this case bco.json.

From the web API perspective, these are the steps:

  1. Call POST api/invocations/{id}/prepare_store_download with payload:

     {
         "model_store_format": "bco.json"
     }

     (You can specify additional parameters for the format.)
  2. Get storageRequestId from the response and poll GET api/short_term_storage/${storageRequestId}/ready until SUCCESS.
  3. Get the resulting file with GET api/short_term_storage/${storageRequestId}.

The reason is that now we use an asynchronous way to generate the download for all invocations.
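
For anyone scripting this, the steps above can be sketched roughly as follows in Python. This is only an illustration, not code from the PR. Several details are assumptions and should be checked against the OpenAPI docs of your Galaxy instance: that the POST response exposes the request id under a `storage_request_id` key, that the `ready` endpoint returns a JSON boolean, and that the API key is sent in an `x-api-key` header. The function names `http_transport` and `export_invocation_as_bco` are made up for this example.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

# Injectable transport: (method, url, json_body) -> raw response bytes.
Transport = Callable[[str, str, Optional[dict]], bytes]


def http_transport(api_key: str) -> Transport:
    """Build a transport that sends requests with urllib.

    Assumption: the Galaxy API key is accepted in an "x-api-key" header.
    """

    def send(method: str, url: str, body: Optional[dict]) -> bytes:
        data = json.dumps(body).encode() if body is not None else None
        request = urllib.request.Request(url, data=data, method=method)
        request.add_header("x-api-key", api_key)
        if data is not None:
            request.add_header("Content-Type", "application/json")
        with urllib.request.urlopen(request) as response:
            return response.read()

    return send


def export_invocation_as_bco(
    base_url: str,
    invocation_id: str,
    send: Transport,
    poll_interval: float = 2.0,
    timeout: float = 300.0,
) -> bytes:
    """Request a BCO export, wait until it is ready, and return the file bytes."""
    # Step 1: ask Galaxy to prepare the export in the bco.json format.
    raw = send(
        "POST",
        f"{base_url}/api/invocations/{invocation_id}/prepare_store_download",
        {"model_store_format": "bco.json"},
    )
    # Assumption: the response JSON carries the id as "storage_request_id".
    storage_request_id = json.loads(raw)["storage_request_id"]

    # Step 2: poll the ready endpoint until the export has finished.
    # Assumption: the endpoint answers with a JSON boolean.
    storage_url = f"{base_url}/api/short_term_storage/{storage_request_id}"
    deadline = time.monotonic() + timeout
    while json.loads(send("GET", f"{storage_url}/ready", None)) is not True:
        if time.monotonic() >= deadline:
            raise TimeoutError("BCO export did not become ready in time")
        time.sleep(poll_interval)

    # Step 3: download the resulting file.
    return send("GET", storage_url, None)
```

To use it against a real server you would call something like `export_invocation_as_bco("https://your.galaxy.server", invocation_id, http_transport(api_key))` and write the returned bytes to a `.json` file. The transport is passed in as a parameter so the polling logic can be exercised without a running Galaxy.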

From the UI perspective, you can check https://github.com/galaxyproject/galaxy/pull/14606

> My only REAL concern is that we still have the ability to download.

Sure, you can still download it. If you are using the API directly, you just need to wait for the asynchronous export to finish before downloading; the UI will take care of that for you.

davelopez avatar Oct 14 '22 14:10 davelopez

Thanks @davelopez and @jmchilton!

bgruening avatar Oct 14 '22 19:10 bgruening