Impose limits on Assets per DataSource
In principle we could support very large numbers of Assets per DataSource, but in practice, at the moment, `?include_data_sources=true` returns the Assets in a single unpaginated list. If we find use cases for very large numbers of Assets, we'll want a paginated way to list them, and perhaps to filter and sort them as well.
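To make that concrete, here is a rough client-side sketch of what paginated Asset listing could look like. The `/assets/{id}` endpoint and the pagination parameter names are placeholders, not part of the current API:

```python
# Hypothetical sketch only: the /assets endpoint and the pagination
# parameters below are placeholders, not part of the current API.
import requests

def iter_assets(base_url, data_source_id, page_size=1000):
    """Yield Assets one page at a time instead of one unpaginated list."""
    offset = 0
    while True:
        response = requests.get(
            f"{base_url}/assets/{data_source_id}",
            params={"page[offset]": offset, "page[limit]": page_size},
        )
        response.raise_for_status()
        page = response.json()["data"]
        if not page:
            return
        yield from page
        offset += len(page)
```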
For now, should we limit Assets to 10k per DataSource? I think the most likely reason for us to breach that limit would be TIFF sequences, so @Wiebke and @dylanmcreynolds should weigh in on whether this limit will be a problem for present use cases at ALS.
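Whatever number we settle on, enforcement could be a simple guard at registration time; a minimal sketch, with the constant and the exception type as placeholders:

```python
# Minimal sketch of server-side enforcement; the limit value and the
# exception type are placeholders for whatever we decide on.
MAX_ASSETS_PER_DATA_SOURCE = 10_000

def check_asset_limit(assets):
    if len(assets) > MAX_ASSETS_PER_DATA_SOURCE:
        raise ValueError(
            f"DataSource would have {len(assets)} Assets, "
            f"exceeding the limit of {MAX_ASSETS_PER_DATA_SOURCE}."
        )
```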
I do know @taxe10 recently encountered a case where she worked with ~30k tiff images, but I do not recall whether these were in a single sequence or spread across multiple sequences. @taxe10, can you share details about this use case?
On that note, we are also discussing the trade-offs and conceptual differences of grouping multiple images into one container. There is an argument for only grouping a sequence of 2d images when it conceptually represents a 2d+1 dataset (whether that is 3 spatial dimensions, or 2 spatial dimensions + time), and not when it is merely a collection of 2d images gathered across several experiments. At the same time, ML algorithms for images may benefit from being able to randomly index into a single container (although there may be performance considerations there to investigate in more detail, too).
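To illustrate the access pattern in that last point, here is a minimal sketch of random indexing into a sequence of per-frame TIFF files; the directory layout and the tifffile dependency are assumptions for illustration, not a reference to any existing code:

```python
# Illustrative sketch of the ML access pattern: random indexing into a
# sequence of 2d TIFF frames stored one file per frame.
from pathlib import Path

import tifffile

class TiffFrameSequence:
    """Lazily index a directory of per-frame TIFF files like one array."""

    def __init__(self, directory):
        self.paths = sorted(Path(directory).glob("*.tif*"))

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        # Each access opens one file; a single container format could
        # serve the same lookup with one open handle and a seek.
        return tifffile.imread(self.paths[index])
```

Whether many small files or one container wins for this pattern is exactly the kind of performance question worth measuring rather than guessing at.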
I should of course have tagged @taxe10 as well. (I think I opened a new tab to go check my recollection of her GH username and lost my train of thought….) Thanks @Wiebke!