Better provision for submitter-supplied batch information (avoid overwrite by Gemma-generated batches)

Open ppavlidis opened this issue 1 year ago • 0 comments

It is extremely rare (<5 cases I've seen out of 20,000?) that batch information is explicitly provided in a GEO submission. When it is available, we should (probably) use it instead of inferring batches ourselves.

We aren't set up to handle this very well. Assigning the batches is no problem, but there is a risk that fillBatchInformation will be run afterwards and clobber the submitter-supplied batch factor with our inferred batches. This is because no BatchInformationFetchingEvent or BatchInformationEvent would have been created for the submitter-supplied batches.

We may need something like a ProvidedBatchInformationEvent that curators would have to add (directly or indirectly) when using submitter-provided information to form batches.
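
To make the idea concrete, here is a rough sketch of the guard, written in Python purely for illustration. Everything in it is an assumption: the `Experiment` class, the `assign_provided_batches` helper, and the way events are stored as plain strings are all hypothetical, and `fill_batch_information` is only a toy stand-in for the real fillBatchInformation. The only point is that the inferred-batch step checks for a ProvidedBatchInformationEvent and backs off when it finds one.

```python
from dataclasses import dataclass, field

# Event type names. PROVIDED is the event proposed in this issue; FETCHED is
# named in this issue as an existing event. Storing them as strings is a
# simplification for the sketch.
PROVIDED = "ProvidedBatchInformationEvent"
FETCHED = "BatchInformationFetchingEvent"


@dataclass
class Experiment:
    """Hypothetical stand-in for an experiment and its audit trail."""
    accession: str
    batch_factor: dict = field(default_factory=dict)  # sample name -> batch label
    events: list = field(default_factory=list)        # audit event type names


def assign_provided_batches(ee, batches):
    """Hypothetical curator action: use the submitter's batches and record it."""
    ee.batch_factor = dict(batches)
    ee.events.append(PROVIDED)


def fill_batch_information(ee, inferred):
    """Toy stand-in for fillBatchInformation.

    Returns True if the inferred batches were written, False if they were
    skipped because submitter-provided batches are already recorded.
    """
    if PROVIDED in ee.events:
        return False  # do not clobber the submitter-supplied batch factor
    ee.batch_factor = dict(inferred)
    ee.events.append(FETCHED)
    return True
```

With this in place, a curator call such as `assign_provided_batches(ee, {"s1": "A", "s2": "B"})` followed by a later `fill_batch_information(ee, {"s1": "X", "s2": "X"})` leaves the submitter's batches untouched, while an experiment without the event is filled in as usual.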

This is so rare it's not important to do now, but we have discussed with GEO whether they can start asking submitters to include it.

FWIW, GSE189788 is the example that arose recently. In that case the batches we created seem to follow what the submitter intended, while the submitter may have made some mistakes in their own batch assignments (ref. this slack discussion). So we're using our batch information anyway.

ppavlidis, Sep 09 '24 17:09