
Getting samtools cram-to-bam error “[E::cram_get_ref] Failed to populate reference for id”?

Open · claudiadast opened this issue 6 years ago • 1 comment

I am trying to convert a cram to a bam using the following samtools command set-up:

    samtools view -h -b -@ 16 file.cram -o file.bam -T reference.fa 2>&1 | tee -a log.txt

However, when I try this, I get the following error:

    [E::cram_get_ref] Failed to populate reference for id 76
    [E::cram_get_ref] Failed to populate reference for id 102
    [E::cram_get_ref] Failed to populate reference for id 129
    ...
    [continues like this for over 20 lines]

I'm not sure why I am getting this error. I've made sure to use the same reference file that was used to generate the cram (I was using a different reference file before) and have also included the reference.dict file.

It's also noteworthy that, although I'm getting this error, the cram file still gets converted to a bam and downstream processes like GATK are still able to run on the bam. It's just concerning to be getting such an error message.

claudiadast, Feb 23 '18 00:02

I'll need to investigate properly, but if you're getting sequence out of it, then it's likely reporting a failure to obtain the sequence via one method and then falling back to another (which works). This is confusing, I'll admit.

Could you please let me know if you have the REF_PATH or REF_CACHE environment variables set, and if so, what REF_PATH is set to? Are the references in that file contained anywhere in the path/cache directories (as md5sums)?
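
A minimal sketch of how to check this from the shell. The helper names `show_ref_env` and `cache_path_for_md5` are hypothetical, and the two-level `%2s/%2s/%s` directory layout is an assumption about how the cache is organised; adjust it if your REF_CACHE pattern differs.

```shell
# Print the reference-lookup environment variables, marking unset/empty ones.
show_ref_env() {
    echo "REF_PATH=${REF_PATH:-<unset>}"
    echo "REF_CACHE=${REF_CACHE:-<unset>}"
}

# Map an MD5 to the file it would occupy under a cache directory.
# ASSUMPTION: the cache uses a two-level %2s/%2s/%s layout,
# i.e. <dir>/<chars 1-2>/<chars 3-4>/<remaining chars>.
cache_path_for_md5() {
    dir=$1
    md5=$2
    first=$(printf '%s' "$md5" | cut -c1-2)
    second=$(printf '%s' "$md5" | cut -c3-4)
    rest=$(printf '%s' "$md5" | cut -c5-)
    printf '%s/%s/%s/%s\n' "$dir" "$first" "$second" "$rest"
}
```

For example, `ls -l "$(cache_path_for_md5 "$HOME/.cache/hts-ref" <md5>)"` shows whether a given sequence is already cached, where `<md5>` is an M5 value taken from the CRAM header and the directory is whatever your cache location actually is.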

Also, do you know which tool produced the CRAM file? It shouldn't matter, of course, and I'm sure all tools will be writing @SQ M5 header tags, but it may help to reproduce the problem precisely.
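
Both questions can usually be answered from the file header. The sketch below is a hypothetical helper, `list_sq_m5`, that reads header text on stdin and lists each @SQ sequence name with its M5 value, flagging any line that lacks the tag.

```shell
# Read SAM/CRAM header text on stdin and print "name<TAB>M5" per @SQ line.
# @SQ lines without an M5 tag are reported as MISSING.
list_sq_m5() {
    awk -F'\t' '
        $1 == "@SQ" {
            name = "?"; m5 = "MISSING"
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SN:/) name = substr($i, 4)
                if ($i ~ /^M5:/) m5 = substr($i, 4)
            }
            print name "\t" m5
        }'
}
```

Typical usage would be `samtools view -H file.cram | list_sq_m5`, and `samtools view -H file.cram | grep '^@PG'` lists the programs recorded as having written or processed the file.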

Thanks.

jkbonfield, Feb 23 '18 09:02