
Failed to find date when extracting Affy metadata for GSE202043

Open · arteymix opened this issue 8 months ago • 1 comment

fillBatchInfo failed: IllegalStateException: Failed to find date
ubic.gemma.core.analysis.preprocess.batcheffects.BatchInfoPopulationException: Failed to pre-process GSE202043: IllegalStateException: Failed to find date
        at ubic.gemma.core.analysis.preprocess.batcheffects.BatchInfoPopulationServiceImpl.fillBatchInformation(BatchInfoPopulationServiceImpl.java:126) ~[gemma-core-1.32.0-SNAPSHOT.jar:?]
        ... suppressed 20 lines
        at ubic.gemma.apps.BatchEffectPopulationCli.processExpressionExperiment(BatchEffectPopulationCli.java:56) ~[gemma-cli-1.32.0-SNAPSHOT.jar:?]
        at ubic.gemma.apps.ExpressionExperimentManipulatingCLI.processBioAssaySet(ExpressionExperimentManipulatingCLI.java:386) ~[gemma-cli-1.32.0-SNAPSHOT.jar:?]
        at ubic.gemma.apps.ExpressionExperimentManipulatingCLI.doAuthenticatedWork(ExpressionExperimentManipulatingCLI.java:339) ~[gemma-cli-1.32.0-SNAPSHOT.jar:?]
        at ubic.gemma.cli.util.AbstractAuthenticatedCLI.doWork(AbstractAuthenticatedCLI.java:105) ~[gemma-cli-1.32.0-SNAPSHOT.jar:?]
        at ubic.gemma.cli.util.AbstractCLI.work(AbstractCLI.java:412) ~[gemma-cli-1.32.0-SNAPSHOT.jar:?]
        at ubic.gemma.cli.util.AbstractCLI.executeCommandWithCliContext(AbstractCLI.java:222) [gemma-cli-1.32.0-SNAPSHOT.jar:?]
        at ubic.gemma.cli.util.AbstractCLI.executeCommand(AbstractCLI.java:177) [gemma-cli-1.32.0-SNAPSHOT.jar:?]
        at ubic.gemma.cli.main.GemmaCLI.main(GemmaCLI.java:364) [gemma-cli-1.32.0-SNAPSHOT.jar:?]
Caused by: java.lang.IllegalStateException: Failed to find date
        at ubic.gemma.core.analysis.preprocess.batcheffects.AffyScanDateExtractor.extract(AffyScanDateExtractor.java:145) ~[gemma-core-1.32.0-SNAPSHOT.jar:?]
        at ubic.gemma.core.analysis.preprocess.batcheffects.BatchInfoParser.getBatchInformationFromFiles(BatchInfoParser.java:134) ~[gemma-core-1.32.0-SNAPSHOT.jar:?]
        at ubic.gemma.core.analysis.preprocess.batcheffects.BatchInfoParser.getBatchInfo(BatchInfoParser.java:99) ~[gemma-core-1.32.0-SNAPSHOT.jar:?]
        at ubic.gemma.core.analysis.preprocess.batcheffects.BatchInfoPopulationServiceImpl.getBatchDataFromRawFiles(BatchInfoPopulationServiceImpl.java:235) ~[gemma-core-1.32.0-SNAPSHOT.jar:?]
        at ubic.gemma.core.analysis.preprocess.batcheffects.BatchInfoPopulationServiceImpl.fillBatchInformation(BatchInfoPopulationServiceImpl.java:116) ~[gemma-core-1.32.0-SNAPSHOT.jar:?]
        ... 28 more

arteymix, May 06 '25 20:05

I doubt this is a bug per se. I would handle it as a curator issue first (and it is a cancer data set).

CEL files can be corrupt; that happens fairly often. If that isn't the problem and the files are otherwise parseable, it's just some quirky version of the CEL format, and we simply won't have batch information for this data set.

(Although the data set was submitted to GEO in 2022, the publication was in 2011, so the data is probably even older than that.)
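For whoever picks up the curation check, the "corrupt vs. quirky format" triage above could be roughed out with a small standalone helper like the one below. This is only a sketch and is not how Gemma's `AffyScanDateExtractor` works: the class `CelFormatSniffer` and its methods are hypothetical names, and the signatures it looks for (`[CEL]` for text version 3, little-endian integers 64 and 4 for binary version 4, bytes 59 and 1 for Command Console files) are my understanding of the Affymetrix format descriptions and should be verified before relying on them. A file that fails to decompress is likely corrupt; a file that reads cleanly but comes back `UNKNOWN` is more likely an unusual variant.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;

/** Hypothetical triage helper; not part of Gemma. */
public class CelFormatSniffer {

    public enum CelFormat { TEXT_V3, BINARY_V4, COMMAND_CONSOLE, UNKNOWN }

    /** Reads a little-endian 32-bit integer starting at offset o. */
    private static int le32(byte[] b, int o) {
        return (b[o] & 0xff) | ((b[o + 1] & 0xff) << 8)
                | ((b[o + 2] & 0xff) << 16) | ((b[o + 3] & 0xff) << 24);
    }

    /** Classifies a file from its leading bytes (assumed signatures, see above). */
    public static CelFormat classify(byte[] head) {
        if (head.length >= 5
                && new String(head, 0, 5, StandardCharsets.US_ASCII).equals("[CEL]")) {
            return CelFormat.TEXT_V3;
        }
        if (head.length >= 8 && le32(head, 0) == 64 && le32(head, 4) == 4) {
            return CelFormat.BINARY_V4;
        }
        if (head.length >= 2 && head[0] == 59 && head[1] == 1) {
            return CelFormat.COMMAND_CONSOLE;
        }
        return CelFormat.UNKNOWN;
    }

    /** Returns up to 8 leading bytes, transparently gunzipping .gz input. */
    public static byte[] readHead(Path file) throws IOException {
        try (InputStream raw = new BufferedInputStream(Files.newInputStream(file))) {
            // Peek at the first two bytes to detect the gzip magic number (0x1f 0x8b).
            raw.mark(2);
            int b0 = raw.read();
            int b1 = raw.read();
            raw.reset();
            InputStream in = (b0 == 0x1f && b1 == 0x8b) ? new GZIPInputStream(raw) : raw;
            byte[] buf = new byte[8];
            int n = 0;
            while (n < buf.length) {
                int r = in.read(buf, n, buf.length - n);
                if (r < 0) {
                    break; // file shorter than 8 bytes
                }
                n += r;
            }
            return Arrays.copyOf(buf, n);
        }
    }

    public static void main(String[] args) {
        for (String arg : args) {
            try {
                System.out.println(arg + "\t" + classify(readHead(Paths.get(arg))));
            } catch (IOException e) {
                // A truncated or damaged gzip stream typically ends up here.
                System.out.println(arg + "\tUNREADABLE (" + e.getMessage() + ")");
            }
        }
    }
}
```

Run against the downloaded raw files for the data set (for example `java CelFormatSniffer *.CEL.gz`), it prints one line per file. Note that it only inspects the header, so a file whose body is damaged can still be reported with a recognized format.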

ppavlidis, May 07 '25 14:05