KeyError from pyfaidx can mean that the chromosome labels in the ref fasta and vcf don't match

Open TeresaPegan opened this issue 2 years ago • 0 comments

Hi, I got this error and figured it out, but I thought I would post it here because interpreting it requires a close look at the scripts. As a low-priority suggestion, it might be worth adding a more user-friendly error message for this situation.

Here is the error I initially got:

Traceback (most recent call last):
  File "/Users/teresapegan/opt/miniconda3/envs/helmsman/lib/python3.6/site-packages/pyfaidx/__init__.py", line 997, in __getitem__
    return self.records[rname]
KeyError: '1'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "helmsman/helmsman.py", line 413, in <module>
    main()
  File "helmsman/helmsman.py", line 329, in main
    data_in = util.processInput(args.mode, args, subtypes_dict)
  File "/Users/teresapegan/helmsman/util.py", line 283, in __init__
    self.data = self.process_vcf(args.input)
  File "/Users/teresapegan/helmsman/util.py", line 400, in process_vcf
    sequence = fasta_reader[row_chr]
  File "/Users/teresapegan/opt/miniconda3/envs/helmsman/lib/python3.6/site-packages/pyfaidx/__init__.py", line 999, in __getitem__
    raise KeyError("{0} not in {1}.".format(rname, self.filename))
KeyError: '1 not in MutSpect/Spalm_arbitrary_reference.fasta.'

It turns out this was simply because my VCF had a chromosome label of "1" while my reference fasta had a more complex chromosome label with the species name in it. Once I changed the reference fasta chromosome label to match the VCF, helmsman worked.
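
A quick way to confirm this kind of mismatch before running helmsman is to compare the chromosome labels in the two files directly. The following is only a minimal sketch under some assumptions: the VCF is uncompressed, FASTA record names are taken as the header text up to the first whitespace (which, as far as I understand, is how pyfaidx keys records by default), and the function names `fasta_names`, `vcf_contigs`, and `check_contigs` are hypothetical helpers, not part of helmsman or pyfaidx.

```python
def fasta_names(path):
    """Return the set of record names in a FASTA file.

    Assumption: a record name is the header text after '>' up to the
    first whitespace character.
    """
    names = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    names.add(fields[0])
    return names


def vcf_contigs(path):
    """Return the distinct CHROM values from an uncompressed VCF's data lines."""
    contigs = set()
    with open(path) as handle:
        for line in handle:
            # Skip meta-information lines, the header line, and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            contigs.add(line.split("\t", 1)[0])
    return contigs


def check_contigs(vcf_path, fasta_path):
    """Raise a readable error if the VCF uses labels missing from the FASTA."""
    available = fasta_names(fasta_path)
    missing = sorted(vcf_contigs(vcf_path) - available)
    if missing:
        raise ValueError(
            "Chromosome label(s) {0} from {1} were not found in {2}. "
            "Labels in the reference: {3}. The VCF and the reference "
            "FASTA must use the same chromosome labels.".format(
                missing, vcf_path, fasta_path, sorted(available)
            )
        )
```

Calling `check_contigs("input.vcf", "reference.fasta")` (placeholder file names) returns nothing when every VCF label is present in the reference, and otherwise raises a `ValueError` that names the offending labels instead of a bare `KeyError`.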

Hope this is helpful, -Teresa

TeresaPegan, Jul 04 '22 17:07