ariba
Filter Bad MLST Profiles from PubMLST
Hi,
Recently I noticed that a few profiles, such as ST-2724 in the "Acinetobacter baumannii#1" schema or ST-2609 in the "Haemophilus influenzae" schema, contain the letter "N" in place of an allele number.
So I get this error when I try to download these schemas:
ariba pubmlstget "Haemophilus influenzae" mlst_hinfluenza_test
WARNING: spades not found in path. Looked for spades.py
Traceback (most recent call last):
  File "/usr/local/bin/ariba", line 312, in <module>
    args.func(args)
  File "/usr/local/lib/python3.8/dist-packages/ariba/tasks/pubmlstget.py", line 11, in run
    preparer.run()
  File "/usr/local/lib/python3.8/dist-packages/ariba/pubmlst_ref_preparer.py", line 81, in run
    self.profile = mlst_profile.MlstProfile(profile_file, duplicate_warnings=True)
  File "/usr/local/lib/python3.8/dist-packages/ariba/mlst_profile.py", line 15, in __init__
    self._load_input_file()
  File "/usr/local/lib/python3.8/dist-packages/ariba/mlst_profile.py", line 29, in _load_input_file
    type_tuple = tuple(int(row[x]) for x in self.genes_list)
  File "/usr/local/lib/python3.8/dist-packages/ariba/mlst_profile.py", line 29, in <genexpr>
    type_tuple = tuple(int(row[x]) for x in self.genes_list)
ValueError: invalid literal for int() with base 10: 'N'
Maybe it would be possible to check whether a non-numeric value is present and skip the corresponding ST profile.
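To illustrate the idea, here is a rough standalone sketch, not ARIBA's actual code. The function name `filter_numeric_profiles` is hypothetical, and I am assuming the profile file is tab-separated with an `ST` column, one column per gene, and an optional `clonal_complex` column. Rows whose allele values cannot be parsed as integers are skipped and reported instead of raising an error.

```python
import csv
import io


def filter_numeric_profiles(lines, non_gene_columns=("ST", "clonal_complex")):
    """Keep only profiles whose allele values are all integers.

    Hypothetical helper; `lines` is any iterable of tab-separated text lines
    (for example an open file) whose first line is the header.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    # Assumption: every column other than ST / clonal_complex is a gene.
    genes = [c for c in reader.fieldnames if c not in non_gene_columns]
    kept = {}      # ST -> tuple of allele numbers
    skipped = []   # STs dropped because of a non-numeric allele such as "N"
    for row in reader:
        try:
            alleles = tuple(int(row[g]) for g in genes)
        except (ValueError, TypeError):
            # ValueError: non-numeric text; TypeError: missing field (None)
            skipped.append(row["ST"])
            continue
        kept[row["ST"]] = alleles
    return genes, kept, skipped


# Made-up two-gene example; real schemas have more genes.
example = "ST\tadk\tatpG\tclonal_complex\n1\t1\t1\t\n2609\tN\t3\t\n"
genes, kept, skipped = filter_numeric_profiles(io.StringIO(example))
print("kept:", kept)
print("skipped:", skipped)
```

Printing a warning with the skipped STs would let the download finish while still telling the user which profiles were ignored.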
Thanks in advance.