KeyError in CrossMap version 0.4.2

Open rozaimirazali opened this issue 4 years ago • 3 comments

Hello. I am getting the following KeyError message:

@ 2020-08-09 09:47:32: Read the chain file: GRCh38_to_GRCh37.chain.gz
@ 2020-08-09 09:47:34: Updating contig field ...
@ 2020-08-09 09:47:34: Lifting over ...
Traceback (most recent call last):
  File "/gpfs/software/genomics/CrossMap/CrossMap-0.4.2/CrossMap-0.4.2/bin/CrossMap.py", line 2144, in <module>
    crossmap_vcf_file(mapping = mapTree, infile= in_file, outfile = out_file, liftoverfile = sys.argv[2], refgenome = genome_file)
  File "/gpfs/software/genomics/CrossMap/CrossMap-0.4.2/CrossMap-0.4.2/bin/CrossMap.py", line 606, in crossmap_vcf_file
    fields[3] = refFasta.fetch(target_chr,target_start,target_end).upper()
  File "pysam/libcfaidx.pyx", line 303, in pysam.libcfaidx.FastaFile.fetch
KeyError: "sequence 'HG989_PATCH' not present"

I am converting from GRCh38 to GRCh37. My chain file is the Ensembl file 'GRCh38_to_GRCh37.chain.gz' and my FASTA file is the Ensembl GRCh37 file.
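A quick way to see whether this kind of KeyError is likely is to compare the contig names the chain file can map onto with the contig names in the target FASTA index. The sketch below is only a diagnostic under stated assumptions: it assumes the chain headers follow the UCSC layout (`chain score tName tSize tStrand tStart tEnd qName ...`) with the liftover destination in the `qName` column, and that a samtools-style `.fai` index sits next to the FASTA file. The function names are made up for illustration and are not part of CrossMap.

```python
import gzip


def chain_target_contigs(chain_path):
    """Collect destination (qName) contig names from chain header lines."""
    opener = gzip.open if str(chain_path).endswith(".gz") else open
    names = set()
    with opener(chain_path, "rt") as handle:
        for line in handle:
            fields = line.split()
            # Header layout (assumed): chain score tName tSize tStrand tStart tEnd qName ...
            if len(fields) >= 8 and fields[0] == "chain":
                names.add(fields[7])
    return names


def fai_contigs(fai_path):
    """Read contig names from the first column of a .fai index."""
    with open(fai_path) as handle:
        return {line.split("\t", 1)[0] for line in handle if line.strip()}


def missing_contigs(chain_path, fai_path):
    """Contigs the chain file maps onto that the FASTA index does not contain."""
    return sorted(chain_target_contigs(chain_path) - fai_contigs(fai_path))
```

If `missing_contigs(...)` returns names such as patch contigs, variants lifted onto those contigs cannot have their reference allele looked up in that FASTA file.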

Thanks

rozaimirazali avatar Aug 09 '20 06:08 rozaimirazali

OK. This is fixed now in v0.4.4

On Sun, Aug 9, 2020 at 1:58 AM Rozaimi Razali wrote:

I thought this KeyError issue was resolved in CrossMap v0.3.2? If the contig does not exist in the target assembly, shouldn't the record silently go to the *.unmap file?

Thanks
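The behavior asked about here (skip the record instead of crashing) could be sketched as a guard around the `fetch` call shown in the traceback. This is an illustration only, not the actual change made in v0.4.4: `fetch_ref_allele` is a hypothetical helper, and it assumes the FASTA handle behaves like `pysam.FastaFile`, which exposes a `references` collection of contig names and a `fetch(contig, start, end)` method.

```python
def fetch_ref_allele(ref_fasta, contig, start, end):
    """Return the uppercase reference sequence, or None if the contig is absent.

    ref_fasta is expected to behave like pysam.FastaFile (assumption):
    it has a `references` collection and a `fetch(contig, start, end)` method.
    """
    if contig not in ref_fasta.references:
        # Caller can route the record to the unmapped output instead of raising.
        return None
    return ref_fasta.fetch(contig, start, end).upper()


# Usage sketch (paths and coordinates are hypothetical):
#   import pysam
#   ref = pysam.FastaFile("GRCh37.fa")
#   allele = fetch_ref_allele(ref, "HG989_PATCH", 100, 101)
#   if allele is None:
#       ...  # write the record to the unmapped output
```

Checking membership first avoids relying on the exception type that `fetch` raises for a missing contig.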

View this issue on GitHub: https://github.com/liguowang/CrossMap/issues/18

liguowang avatar Aug 10 '20 03:08 liguowang

Thank you for the update.

Do you know where I can find information on the following terms? I searched https://crossmap.readthedocs.io/ but could not find a definition for them.

Fail(Multiple_hits) Fail(REF==ALT) Fail(Unmap) etc...

rozaimirazali avatar Aug 10 '20 06:08 rozaimirazali

Hi: Here is the explanation. We will update the documentation soon:

  • Fail(Multiple_hits) : This genomic location was mapped to two or more locations in the target assembly.
  • Fail(REF==ALT) : After liftover, the reference allele and the alternative allele are the same (i.e. this is NOT a SNP/variant after liftover).
  • Fail(Unmap) : Unable to map this genomic location to the target assembly.
  • Fail(KeyError) : Unable to find the contig ID (or chromosome ID) in the reference genome sequence (of the target assembly).
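To see how many records failed for each of the reasons listed above, the tags can be tallied from the unmapped output. The sketch below assumes only that each failed record carries a `Fail(...)` tag somewhere on its line and that header lines start with `#`; the exact column the tag appears in is not assumed. `count_fail_reasons` is a hypothetical helper, not part of CrossMap.

```python
import re
from collections import Counter

# Matches tags such as Fail(Unmap) or Fail(REF==ALT) and captures the reason.
FAIL_TAG = re.compile(r"Fail\(([^)]+)\)")


def count_fail_reasons(lines):
    """Count Fail(...) reasons over an iterable of unmapped-record lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # skip VCF header lines
        tags = FAIL_TAG.findall(line)
        if tags:
            # Use the last tag on the line, assuming the tag is appended to the record.
            counts[tags[-1]] += 1
    return counts


# Usage sketch (file name is hypothetical):
#   with open("output.vcf.unmap") as handle:
#       for reason, n in count_fail_reasons(handle).most_common():
#           print(reason, n)
```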

liguowang avatar Aug 12 '20 23:08 liguowang