[WIP] Support collections as 'single item' in Libraries
This is a very WIP, but I wanted to get some feedback to see if this is going in the right direction or I'm completely off-road here.
What did you do?
I basically tried to simulate what the Library API is doing right now with simple datasets, so I:
- Created a new
LibraryDatasetCollectionmodel to act as a proxy of an existing LDCA. - Modified the HDCAManager and HDCASerializer to implement some required methods that were missing.
- Created a new
_copy_hdca_to_library_folder_singlemethod in the controller that will add a newLibraryDatasetCollectionto the database and return the basic information:{ "id": "f2db41e1fa331b3e", "name": "ABCD Collection", "deleted": false, "can_manage": true, "create_time": "2021-04-08 11:52 AM", "update_time": "2021-04-08 11:52 AM", "type": "collection", "element_count": 4, "collection_type": "list" } - The index action for the folder_contents API now returns in the
folder_contentslist three types of items, folders, datasets, and collections. - I couldn't help to add pydantic models for the
FolderContentsServiceinputs and outputs, this probably should have been part of the API modernization in a different PR but they usually help me better understand what is being returned by the API. Anyway, now it will be really easy to just add the FastAPI endpoints :) - Updated and added some more cases to the API tests for collection items.
Why did you make this change?
This is related to https://github.com/galaxyproject/galaxy/issues/11250 and somewhat a follow-up on https://github.com/galaxyproject/galaxy/pull/3559 but trying to add the collection as a single library item that will act as a special kind of folder, instead of importing all the contained datasets as individual items.
Open questions / Feedback required
- I am not dealing with the permissions for collections right now see this. What will be the best way to deal with them? Do I need to create something similar to
LibraryDatasetPermissionsand also for associations? -
I have replaced the previous call in the API to add a new HDCA by unpacking the datasets with the one that returns the LDCA but I didn't remove the previous implementation. Maybe is a good idea to add an optional bool parameter to CreateLibraryItemPayload that will call the old way of importing collections by default so there is no change for existing calls to the API that expect a list of datasets as return value.Easier to do than to explain :) see https://github.com/galaxyproject/galaxy/pull/11806/commits/109264b8f47c3328e9c45132219140b109006fbf - Any other information worth including in CollectionItem?
- How to deal with the actual contents of the collection? Treat the library collection item like a read-only folder? What about the different types of collections?
How to test the changes?
- [x] I've included appropriate automated tests.
- [x] Instructions for manual testing are as follows:
- Go to Data Libraries
- From inside a folder, choose
Datasets -> Import form History - Select a collection from your history
- The library folder will contain a new collection item instead of all the unpacked datasets.
License
- [x] I agree to license these contributions under Galaxy's current license.
- [x] I agree to allow the Galaxy committers to license these and all my past contributions to the core galaxy codebase under the MIT license. If this condition is an issue, uncheck and just let us know why with an e-mail to [email protected].
For UI Components
Here is a preview of what is showing now without any changes in the client:

LDCAs were only ever sketched out - not really implemented so this is really interesting and I had no idea it was coming.
You said:
Created a new LibraryDatasetCollection model to act as a proxy of an existing LDCA.
Is there a reason not to just use the LDCAs right in the library? What metadata belongs on LDCs and what metadata belongs on LDCA in your mind?
This is fascinating and exciting and a bit terrifying.
So nothing in the models really ensures that LDCAs only contain library datasets and HDCAs only contain history datasets and in fact the first issue I was given said they should be interchangeable that way - I've really come to realize that is not a viable user experience though. I would suggest implementing checks and such to make sure library collections only contain library datasets and history collections only contain HDAs going forward.
Thanks, @jmchilton that is exactly the kind of feedback I need. I am trying to learn more about HDAs, HDCAs, LDDAs, LDCAs... right now everything still sounds a bit like a tongue-twister in my head 😆
LDCAs were only ever sketched out - not really implemented so this is really interesting and I had no idea it was coming.
That may explain why I had difficulties trying to build the main picture... 😅
Is there a reason not to just use the LDCAs right in the library? What metadata belongs on LDCs and what metadata belongs on LDCA in your mind?
Honestly, I just tried to get something half-working to set the ground for discussion and feedback while I try to get myself more familiar with these concepts. So I just tried to kind of mimic what is already there with LibraryDataset and LibraryDatasetDatasetAssociation but I still need to wrap my head around all the concepts.
This is fascinating and exciting and a bit terrifying.
I am stuck in the terrifying part 🤣