Ignore `N`s that might have slipped through when updating PRGs

Open leoisl opened this issue 2 years ago • 4 comments

When running the 4-way pipeline to update the E. coli PRG with Illumina data, I got this error:

Traceback (most recent call last):
  File "/usr/local/bin/make_prg", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.9/site-packages/make_prg/__main__.py", line 94, in main
    args.func(args)
  File "/usr/local/lib/python3.9/site-packages/make_prg/subcommands/update.py", line 182, in run
    denovo_variants_db = DenovoVariantsDB(
  File "/usr/local/lib/python3.9/site-packages/make_prg/update/denovo_variants.py", line 548, in __init__
    locus_name_to_denovo_loci = self._get_locus_name_to_denovo_loci()
  File "/usr/local/lib/python3.9/site-packages/make_prg/update/denovo_variants.py", line 532, in _get_locus_name_to_denovo_loci
    return self._get_locus_name_to_denovo_loci_core(filehandler)
  File "/usr/local/lib/python3.9/site-packages/make_prg/update/denovo_variants.py", line 522, in _get_locus_name_to_denovo_loci_core
    variants = self._read_variants(
  File "/usr/local/lib/python3.9/site-packages/make_prg/update/denovo_variants.py", line 495, in _read_variants
    denovo_variant = cls._read_DenovoVariant(
  File "/usr/local/lib/python3.9/site-packages/make_prg/update/denovo_variants.py", line 477, in _read_DenovoVariant
    denovo_variant = DenovoVariant(
  File "/usr/local/lib/python3.9/site-packages/make_prg/update/denovo_variants.py", line 41, in __init__
    DenovoVariant._param_checking(
  File "/usr/local/lib/python3.9/site-packages/make_prg/update/denovo_variants.py", line 59, in _param_checking
    DenovoVariant._check_sequence_is_composed_of_ACGT_only(alt)
  File "/usr/local/lib/python3.9/site-packages/make_prg/update/denovo_variants.py", line 85, in _check_sequence_is_composed_of_ACGT_only
    raise DenovoError(f"Found a non-ACGT seq ({seq}) in a denovo variant")
make_prg.update.denovo_variants.DenovoError: Found a non-ACGT seq (N) in a denovo variant

36416 new variants were found, and only one of them contains an `N`. I'd much prefer to simply issue a warning here: https://github.com/iqbal-lab-org/make_prg/blob/c2c7ff0f40dabc5034e2bfad7d6c2229e5649b09/make_prg/update/denovo_variants.py#L85 rather than erroring out and being unable to update the PRG.
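To illustrate the behavior I have in mind, here is a minimal sketch of warn-and-skip filtering. The names `is_acgt_only` and `filter_acgt_alts` are hypothetical and are not part of make_prg; the sketch also assumes uppercase sequences and treats an empty sequence as valid, which may not match what `DenovoVariant` actually expects.

```python
import logging
from typing import Iterable, List

logger = logging.getLogger(__name__)

# Hypothetical helpers for illustration only; not make_prg's actual API.
_ALLOWED_BASES = frozenset("ACGT")


def is_acgt_only(seq: str) -> bool:
    """Return True if every character of seq is one of A, C, G, T.

    Assumption: sequences are uppercase; an empty sequence counts as valid.
    """
    return set(seq) <= _ALLOWED_BASES


def filter_acgt_alts(alts: Iterable[str]) -> List[str]:
    """Keep ACGT-only sequences; log a warning for each one that is dropped."""
    kept = []
    for alt in alts:
        if is_acgt_only(alt):
            kept.append(alt)
        else:
            # Warn and skip instead of raising, so one bad variant
            # does not abort the whole update.
            logger.warning(
                "Ignoring denovo variant with a non-ACGT seq (%s)", alt
            )
    return kept
```

With something like this, a single variant containing `N` would be logged and dropped while the remaining variants are still applied.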

leoisl avatar Jun 07 '23 10:06 leoisl

Done in https://github.com/iqbal-lab-org/make_prg/commit/46534bccf4c6a347ef17707747da3e6021134686; now testing it in the 4-way pipeline.

leoisl avatar Jun 07 '23 15:06 leoisl

How do we get an N in a novel variant?

mbhall88 avatar Jun 07 '23 22:06 mbhall88

For some reason racon output an `N` in a consensus sequence...

leoisl avatar Jun 09 '23 17:06 leoisl

Weird...

mbhall88 avatar Jun 13 '23 00:06 mbhall88