Failed to sanitize XML document destined for handler class (..) ListBucketHandler
Expected behavior and actual behavior
I am running the fetchngs pipeline on Seqera Platform with AWS Batch to retrieve a public dataset of about 1000 samples (~10 TB in total), but the run failed with an error that I believe is related to Nextflow / the AWS SDK:
Caused by: com.amazonaws.AbortedException:
at com.amazonaws.internal.SdkFilterInputStream.abortIfNeeded(SdkFilterInputStream.java:61)
at com.amazonaws.internal.SdkFilterInputStream.read(SdkFilterInputStream.java:89)
at com.amazonaws.event.ProgressInputStream.read(ProgressInputStream.java:180)
at java.base/sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:270)
at java.base/sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:313)
at java.base/sun.nio.cs.StreamDecoder.read(StreamDecoder.java:188)
at java.base/java.io.InputStreamReader.read(InputStreamReader.java:177)
at java.base/java.io.BufferedReader.read1(BufferedReader.java:211)
at java.base/java.io.BufferedReader.read(BufferedReader.java:287)
at java.base/java.io.Reader.read(Reader.java:250)
at com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser.sanitizeXmlDocument(XmlResponsesSaxParser.java:211)
... 72 common frames omitted
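The `AbortedException` is thrown when the S3 response stream is closed while the SDK is still reading the `ListBucket` XML body, which can happen under heavy concurrency or after a client-side timeout. As a possible (unverified) workaround, the S3 client's retry and timeout settings can be raised in `nextflow.config`; the option names below are Nextflow's documented `aws.client` settings, but the values are illustrative only:

```groovy
// nextflow.config fragment — illustrative tuning, not a confirmed fix
aws {
    client {
        maxErrorRetry     = 10      // retry transient S3 failures more aggressively
        connectionTimeout = 60000   // connection timeout in ms
        socketTimeout     = 300000  // socket read timeout in ms
        maxConnections    = 50      // cap on concurrent S3 connections
    }
}
```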
Steps to reproduce the problem
A rough outline of what I did:
outdir: s3://xxx/external_projects/data
input: s3://xxx/sra_ids.csv
nf_core_pipeline: rnaseq
Executor: AWS Batch, with the work directory set to the same S3 bucket where the output is stored.
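A run along these lines could be launched from the CLI roughly as follows (bucket paths are placeholders, and on Seqera Platform these values are set in the launch form rather than on the command line):

```shell
nextflow run nf-core/fetchngs \
    -profile awsbatch \
    -work-dir 's3://<bucket>/external_projects/work' \
    --input 's3://<bucket>/sra_ids.csv' \
    --outdir 's3://<bucket>/external_projects/data' \
    --nf_core_pipeline rnaseq
```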
Program output
Environment
- Nextflow version: 23.10.0 build 5889
- Java version: unknown
- Operating system: AWS Batch / Seqera Platform
- Bash version: unknown
Additional context
I ran the failing task's command in a local terminal, to rule out issues with the files themselves, and it completed successfully.
A small excerpt of the .nextflow.log: