Missing file in iGenomes

Open ekushele opened this issue 1 year ago • 9 comments

Description of the bug

Hi,

I'm trying to run sarek, and I'm getting the following error:


ERROR ~ Error executing process > 'NFCORE_SAREK:SAREK:CRAM_SAMPLEQC:BAM_NGSCHECKMATE:BCFTOOLS_MPILEUP (1)'

Caused by:
  Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: DWAFHYDZD0HB3HXB; S3 Extended Request ID: 9UnG6G4ZvIrq/mijFQxFBq5uJUwvZXWga5ogNegCaF2yXEQ4UgMlcuMwhj8udwpksh2uh3JsTg0=; Proxy: null)


 -- Check '.nextflow.log' file for details
-[nf-core/sarek] Sent summary e-mail to [email protected] (mail)-
-[nf-core/sarek] Pipeline completed with errors-
WARN: Killing running tasks (2)

Command used and terminal output

$ nextflow run -resume -params-file OV3_params.yaml  -profile singularity nf-core/sarek

where the params file is:

input: OV3_sample_sheet.csv
outdir: output
step: variant_calling
wes: true
tools: 'mutect2,vep'
genome: GATK.GRCh37
save_reference: true
download_cache: true


Relevant files

_No response_

System information

nextflow version:  23.04.4
nf-core/sarek: 3.4.1

ekushele avatar May 08 '24 07:05 ekushele

I'm hitting the same error in mpileup.

logust79 avatar May 08 '24 12:05 logust79

That sounds like a credentials issue with AWS. Can you try unsetting your AWS credentials before running the analysis? That should fix it. Otherwise, adding this to a custom.config file and supplying it with -c custom.config should do the trick:

aws {
    client {
        anonymous = true
    }
}
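
As a rough sketch of how that could be wired up, assuming the config file lives in the launch directory and reusing the command from the original report (the `cmd` variable is only for illustration):

```shell
# Write the anonymous-access config to custom.config in the current directory.
cat > custom.config <<'EOF'
aws {
    client {
        anonymous = true
    }
}
EOF

# The command from the issue, with the extra config supplied via -c.
# It is only printed here; run it yourself from the launch directory.
cmd="nextflow run -resume -params-file OV3_params.yaml -profile singularity nf-core/sarek -c custom.config"
echo "$cmd"
```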

maxulysse avatar May 08 '24 12:05 maxulysse

I had the anonymous setting at the time of the error. I am going to try with GRCh38 and see what happens.

EDIT by @maxulysse removed quoted text

logust79 avatar May 08 '24 14:05 logust79

I just finished a run on 'GRCh38' and it went through without this issue.

logust79 avatar May 08 '24 16:05 logust79

@maxulysse thank you, but it didn't work. Do you have another solution?

ekushele avatar May 09 '24 10:05 ekushele

The most likely cause seems to be some AWS credentials being picked up somehow. Can you confirm that there is no .aws directory in your $HOME directory (if there is, can you try moving it aside)?

Similarly, can you confirm that you don't set any AWS-related environment variables, i.e. env | grep AWS doesn't return any lines?

If you do this in a setup involving multiple nodes, you'd need to make sure it's true on the node where the actual nextflow processes are being run.

If you are running in AWS, some extra steps could be needed. We can hold off on those for now, but please let us know if that's the case.
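
The two checks above could be combined into one small script, roughly like this. It is only a sketch: the function name `list_aws_sources` is made up for illustration, and it looks only at the `.aws` directory and `AWS`-prefixed environment variables, not at other places credentials might come from (for example instance roles when running in AWS).

```shell
# list_aws_sources HOME_DIR: print each AWS credential source visible to this shell.
# Empty output means neither a .aws directory nor AWS* variables were found.
list_aws_sources() {
    home_dir="$1"
    if [ -d "$home_dir/.aws" ]; then
        echo "directory: $home_dir/.aws"
    fi
    # Print only variable names, never their values, to avoid leaking secrets.
    env | grep '^AWS' | cut -d= -f1 | sed 's/^/variable: /' || true
    return 0
}

list_aws_sources "$HOME"
```

On a multi-node setup this would need to be run on the node that executes the tasks, not only on the login node.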

pontus avatar May 10 '24 08:05 pontus

I ran it with and without AWS credentials and hit the same issue either way. env | grep AWS confirmed nothing is set. I think some S3 files specific to GRCh37 are out of reach.

logust79 avatar May 10 '24 08:05 logust79

Can you try again?

maxulysse avatar May 10 '24 08:05 maxulysse

> Can you try again?

Tried and worked without aws credentials for GRCh37. Thanks for fixing the S3 file access :)

logust79 avatar May 10 '24 09:05 logust79