[E::parse_cigar] CIGAR length too long at position 1 (274808464H)

Open davidaray opened this issue 10 months ago • 13 comments

Me again.

Managed to get the software to run but it only ran for four minutes before hitting this CIGAR error.

A quick Google search suggests that this error appears when reads are too long for samtools to handle (https://github.com/samtools/samtools/issues/1667).

However, I'm not dealing with reads; this is a job where I'm analyzing whole assemblies. I'm sure this is a mistake on my part somewhere, given that the documentation specifically says GraffiTE can be run using whole assemblies.
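One way to test whether the linked samtools issue could apply to assemblies is to look for contigs longer than the largest CIGAR operation length that BAM can store. I am assuming that limit is 2^28 - 1 = 268,435,455 bases, based on the 28-bit length field in BAM CIGAR encoding; it is not something the GraffiTE documentation states. The sketch below uses a hypothetical helper, `long_contigs`, that lists FASTA records above a limit with plain awk:

```shell
# Hypothetical helper: print name and length of every FASTA record longer than a limit.
# The default limit of 2^28 - 1 is an assumption about the BAM CIGAR length field.
long_contigs() {
    fasta="$1"
    limit="${2:-268435455}"
    awk -v limit="$limit" '
        /^>/ {
            # Report the previous record before starting a new one.
            if (name != "" && len > limit) print name "\t" len
            name = substr($1, 2)   # record name without the leading ">"
            len = 0
            next
        }
        { len += length($0) }      # accumulate sequence length across lines
        END { if (name != "" && len > limit) print name "\t" len }
    ' "$fasta"
}
```

Running `long_contigs cTho_B.fa` would print any contig that is too long to be hard-clipped in a single CIGAR operation; no output would mean no contig exceeds the assumed limit.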

My command line:

nextflow run https://github.com/cgroza/GraffiTE \
   --assemblies cTho_assemblies.csv \
   --TE_library mammals.plus.covid_bats2.14072022.fa \
   --reference ../assemblies/cTho_A.fa \
   --graph_method pangenie \
   --genotype false \
   --cores 12 \
   --mammal \
   --svim_asm_threads 12 \
   --asm_divergence asm5 \
   --svim_asm_time 2h

The error:


[-        ] process > svim_asm       -
[-        ] process > survivor_merge -
[-        ] process > repeatmask_VCF -
[-        ] process > tsd_prep       -
[-        ] process > tsd_search     -
[-        ] process > tsd_report     -

executor >  local (1)
[11/563cf2] process > svim_asm (1)   [  0%] 0 of 2
[-        ] process > survivor_merge -
[-        ] process > repeatmask_VCF -
[-        ] process > tsd_prep       -
[-        ] process > tsd_search     -
[-        ] process > tsd_report     -

executor >  local (2)
[12/90fee5] process > svim_asm (2)   [  0%] 0 of 2
[-        ] process > survivor_merge -
[-        ] process > repeatmask_VCF -
[-        ] process > tsd_prep       -
[-        ] process > tsd_search     -
[-        ] process > tsd_report     -
ERROR ~ Error executing process > 'svim_asm (1)'

Caused by:
  Process `svim_asm (1)` terminated with an error exit status (1)

Command executed:

  mkdir asm
  minimap2 -a -x asm5 --cs -r2k -t 12 -K 500M cTho_A.fa cTho_B.fa | samtools sort -m4G -@4 -o asm/asm.sorted.bam -
  samtools index asm/asm.sorted.bam
  svim-asm haploid --min_sv_size 100 --types INS,DEL --sample cTho_B asm/ asm/asm.sorted.bam cTho_A.fa
  sed 's/svim_asm\./cTho_B\.svim_asm\./g' asm/variants.vcf > cTho_B.vcf

Command exit status:
  1

Command output:
  (empty)

Command error:
  [M::mm_idx_gen::31.011*1.36] collected minimizers
  [M::mm_idx_gen::37.874*1.71] sorted minimizers
  [M::main::37.874*1.71] loaded/built the index for 812 target sequence(s)
  [M::mm_mapopt_update::40.498*1.66] mid_occ = 168
  [M::mm_idx_stat] kmer size: 19; skip: 19; is_hpc: 0; #seq: 812
  [M::mm_idx_stat::42.456*1.63] distinct minimizers: 161244371 (94.15% are singletons); average occurrences: 1.421; average spacing: 9.923; total length: 2273669687
  [E::parse_cigar] CIGAR length too long at position 1 (274808464H)
  [E::parse_cigar] CIGAR length too long at position 877 (272289946H)
  [E::parse_cigar] CIGAR length too long at position 4012 (275636627H)
  samtools sort: truncated file. Aborting
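For reference, the three hard-clip lengths in the log can be compared against the largest operation length that fits in a 28-bit field. That BAM uses 28 bits for this is my assumption, and `check_cigar_lengths` is a hypothetical name used only for this sketch:

```shell
# Hypothetical helper: report which CIGAR operation lengths exceed the assumed BAM limit.
check_cigar_lengths() {
    max_cigar_len=$(( (1 << 28) - 1 ))   # 268435455, assuming a 28-bit length field
    for len in "$@"; do
        if [ "$len" -gt "$max_cigar_len" ]; then
            echo "$len exceeds $max_cigar_len by $(( len - max_cigar_len ))"
        fi
    done
}

# Lengths copied from the parse_cigar errors above.
check_cigar_lengths 274808464 272289946 275636627
```

All three values are a few million bases over 268435455, which would be consistent with a contig in cTho_B.fa being longer than that limit.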

Any insight would be appreciated.

David

davidaray, Sep 21 '23 14:09