[simd_abpoa_align_sequence_to_subgraph1] Error in cg_backtrack. (4)

Open ekg opened this issue 3 years ago • 17 comments

I finally found a small reproducible example of an alignment problem.

To reproduce on this input FASTA, fail_smoothxg_block_3055.fa.txt:

abpoa -s -r 3 fail_smoothxg_block_3055.fa
[simd_abpoa_align_sequence_to_subgraph1] Error in cg_backtrack. (4)

ekg avatar Oct 06 '20 13:10 ekg

Thank you @ekg for providing this example. This is a bug related to the banding, so disabling banded DP (setting b to -1) is the easiest way to get rid of it. I am also trying to figure out how to fix it.

The banded DP is more fragile when the sequence lengths differ too much, as in this data: 264 vs 443. This happened previously, and I thought I had fixed it.

yangao07 avatar Oct 06 '20 13:10 yangao07
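
To illustrate why a large length difference strains banded DP, here is a minimal sketch using a fixed-width band around the main diagonal of a plain pairwise edit-distance matrix. This is an assumption-laden simplification, not abPOA's adaptive banding on a partial order graph; the function name and the band parameter `w` are hypothetical. When the length difference exceeds the band width, the final cell is never computed, so there is nothing valid to trace back from.

```python
def banded_edit_distance(a, b, w):
    """Edit distance restricted to cells with |i - j| <= w.

    Returns None when the end cell (len(a), len(b)) lies outside the band.
    """
    n, m = len(a), len(b)
    if abs(n - m) > w:
        return None  # end cell is outside the band: no traceback can start there
    INF = float("inf")
    # Row 0: only columns inside the band are reachable.
    prev = [j if j <= w else INF for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [INF] * (m + 1)
        lo, hi = max(0, i - w), min(m, i + w)
        for j in range(lo, hi + 1):
            if j == 0:
                cur[j] = i  # first column: i deletions
                continue
            cur[j] = min(
                prev[j - 1] + (a[i - 1] != b[j - 1]),  # match or mismatch
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
            )
        prev = cur
    return prev[m]


# Lengths taken from the comment above: 264 vs 443, a difference of 179.
short, long_ = "A" * 264, "A" * 443
print(banded_edit_distance(short, long_, 10))   # band too narrow
print(banded_edit_distance(short, long_, 179))  # band just wide enough
```

A wider band always reaches the end cell but costs more time and memory, which is the trade-off discussed in the following comments.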

Any workaround (even dropping into non-banded mode when this happens) would be helpful! What do you suggest?

Running everything non-banded to avoid this issue would be expensive.

ekg avatar Oct 07 '20 08:10 ekg
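
One way to get the fallback asked about above without paying the non-banded cost on every block is to retry only the failing inputs. The sketch below wraps two command lines in a hypothetical helper, `run_with_fallback`. It assumes that a failed run either exits with a non-zero status or prints the `Error in cg_backtrack` message to stderr, and that `-b -1` is the command-line form of "set b as -1"; neither assumption is confirmed in this thread.

```python
import subprocess
import sys

ERROR_MARKER = "Error in cg_backtrack"  # message text quoted in this thread


def run_with_fallback(banded_cmd, unbanded_cmd, timeout=600):
    """Try banded_cmd first; rerun with unbanded_cmd only if it fails.

    Returns (label, stdout) for the first run that succeeds.
    """
    for label, cmd in (("banded", banded_cmd), ("non-banded", unbanded_cmd)):
        try:
            proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        except subprocess.TimeoutExpired:
            print(f"{label} run timed out", file=sys.stderr)
            continue
        if proc.returncode == 0 and ERROR_MARKER not in proc.stderr:
            return label, proc.stdout
        print(f"{label} run failed", file=sys.stderr)
    raise RuntimeError("alignment failed in both banded and non-banded mode")


# Hypothetical usage with the commands from this thread (requires abpoa on PATH):
# fasta = "fail_smoothxg_block_3055.fa"
# label, msa = run_with_fallback(
#     ["abpoa", "-s", "-r", "3", fasta],
#     ["abpoa", "-s", "-r", "3", "-b", "-1", fasta],
# )
```

Only inputs that trip the banding bug take the slower path, so the overall cost stays close to the banded run.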

I'm also running into this issue when using the python API. Instead of being able to handle the error, the thread that gets the error just hangs since this error kicks you out of python. Is there a way for me to be able to handle this error through the python API? Disabling adaptive banding takes too long.

rvolden avatar Oct 13 '20 19:10 rvolden
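
Because the error terminates the interpreter, a `try`/`except` around the call cannot catch it. A possible workaround, sketched below, is to run each alignment in a short-lived child interpreter so that a hard exit only ends the child. The helper `msa_isolated` and the `CHILD` script are hypothetical; the `msa_aligner(match=..., extra_b=...)` and `msa(..., out_cons=False, out_msa=True)` calls mirror the snippet posted later in this thread, and the `msa_seq` attribute on the result is an assumption about pyabpoa.

```python
import json
import subprocess
import sys

# Child program: reads {"seqs": [...], "extra_b": int} as JSON from stdin and
# prints the MSA rows as JSON. The msa_seq attribute is assumed, not confirmed.
CHILD = """
import json, sys
import pyabpoa as poa
job = json.load(sys.stdin)
aligner = poa.msa_aligner(match=5, extra_b=job["extra_b"])
res = aligner.msa(job["seqs"], out_cons=False, out_msa=True)
json.dump(list(res.msa_seq), sys.stdout)
"""


def msa_isolated(seqs, extra_b, child_src=CHILD, timeout=300):
    """Return the MSA rows, or None if the child exited abnormally or timed out."""
    job = json.dumps({"seqs": seqs, "extra_b": extra_b})
    try:
        proc = subprocess.run(
            [sys.executable, "-c", child_src],
            input=job, capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None
    if proc.returncode != 0:
        return None  # the child died (for example on the traceback error)
    return json.loads(proc.stdout)
```

A caller that gets `None` back can then retry the same sequences with a larger `extra_b` instead of hanging.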

There is now a flag on the result object that indicates if the traceback was OK. It's not propagated to python.

ekg avatar Oct 13 '20 19:10 ekg

@rvolden As Erik mentioned, this new flag has not been added to pyabpoa yet; I will get it done soon. However, I haven't implemented the ambiguous strand mode in Python yet, so I guess what you encountered is different from what Erik posted here. Can you share the sequences that cause the error? That would be very helpful. Thanks.

yangao07 avatar Oct 14 '20 01:10 yangao07

Anyway, the ultimate goal is to fix this bug rather than just break out of the loop without providing any alignment result. I am working on that.

yangao07 avatar Oct 14 '20 01:10 yangao07

You're right, the traceback error is 2, not 4. It's for a pairwise alignment where one sequence has a long polyA but the other doesn't. I'm including the initialization as well as the sequences here:

import pyabpoa as poa  # assumed import; the original snippet omitted it

# subreads: list holding the two sequences shown below
poa_aligner = poa.msa_aligner(match=5, extra_b=16) # anything lower for extra_b throws the traceback error
res = poa_aligner.msa(subreads, out_cons=False, out_msa=True)
# errors out here
>0
CTGACATTTCGGTGGAGAATTTTTTTATATTTGTATTCTCAGCGTAAAGTCTCCCCTGGATATATTTGTGTTTATGCTGATATTGGCATCCATGTTTGACGGAGGATTATCAGGTAGGTAAATTACTTCATTTGGAGATGAGGTGGTTGTACATTAACTTCCCTCCTCC
TATATTGACTAGCCTTCAACTGGTTCTAAGCAGTGGTATCAACGCAGAGTACATGGGGATTCCTGAAGCTGACAGCATTCGGGCCGAATGTCTCGCTCCGTGGCCTAGCTGTGCTCGCGCTTCTCTCTCTTTCTGGCCTGGAGGCTATCAGCGTACTCCAAAGATTCAGGT
TTACTCACGTCATCACAGAGAATGGAAAGTCAAATTTCCTGAATTGCTATGTGTCAGGTTTTCATCCATCCGACATTGAAGTTGACTTACTGAAGAATGGAGAAGAATTGAAAAGTGGAGCATTCAGACTTGTCTTTCAGCAAGGACTGGTCTTTCTATCTCTTGTACTAC
ACTGAATTCACCCCCACTGAAAAGATAGGTATACTGCCATGTAGAACCATGTGACTTTGTCACAGCCCAAGATAGTTAAGTGGGATCCGAGACATGTAAGCAGCATCATGGAGGTTTGAAGATGCCGCATTTGGATTGGATGAATTCAAATTCTGCTTGCTTGCTTTTTAA
TATTGATATGCTTATACACTTACACTTTATGCACAAAATGTAGGGTTATAATAATGTTAACATGGACATGATCTTCTTTATAATTCTACTTTGAGTGCTGTCTCCATGTTTGATGTATCTGAGCAGGTTGCTCCACAGGTAGCTCTAGGAGGGCTGGCAACAGAGGTGGGA
GCAGAGATTCTCTTATCCAACATCAACATCTTGGTCAGATTTGAACTCTTCAATCTCTTGCACTCAAAGCTTGTTAAGATAGTTAAGCGTGCATAAGTTAACTTCCAATTTACATACTCTGCTTAGAATTTGGGGGAAAATTTAAATATAGTTGAACCCAGGATTATTGGA
AATTTGTTATAATGAATGAAACATTTTGTCATATAAGATTCATATTTACTTCTTATACATTTGATAAAGTAAGGCATGGTTGTGGTTAATCTGGTTTTATTTTTGTTCCACAAGTTAAATAAATCATAAAACTTGAAAAAAAAAAAAAAAAAAAAAAAAAAAATAAAAAAA
AAAAAAAAAAAAAGTATTCCATAAGACTCTGCGTTGATACCACTGCTT
>1
CTGACATTTCGGTGGAGAATCTTATTATATCGTGCTTCTCAACTGTAAAGTCTCCCTGGATATATTTGTGTTTATGCTGATATTGGCATCCATGTTTGACAGAGGATTATCAGGTAGGT
AAATTACTTCATTTGGAGATGAGGTGGTTGTACATTAACTTCCCTAAATATATATCTTCAAGCCTTCAACTGAAAGTTCTAAGCAGTGGTATCAACGCGAGTCTTTTTATGGAATACTTATTGAACAGGTAATTCACTGTAATATTTATTAAGTGATGACTAGAGGGATAT
TGATAGATGTAAAAATTTTCACTCACAGTGAACATGAAACCTTTACACATGTAAGGTTTAGATTCTTTTTTTTTAATCTGCCCCTTTCAGATTATATCATGGTATATGAAGCACTGGTGAGGTCTATGTCACCAGAAATTCCCCAGTTTGCTGATTTGTTAGGTTTTTTAA
CCCGATGATTGTACTGCAACAAGTGAGCATCATTCACTGCAACCTTGAAGTGGTCAGGTTCAACCAGTACTTGTATTTTGAATGGTTTCCCACTTTCAAATGGGAAAACCGACTGTCTTTCTTCCCTTCCCCAGTTATTATCCAGCTTTGTATTGCCAAACAATGACTCTC
CTGTTGTTCTCATTGAAGCGTGGGTTAAAGTGGAAGGCAACATCATTCCCTCTTTGGAAATCTAAAGCAATTCTGTTTGCATTGGGCTTCACCGTGCCCAGAATTGTTATCAGCATGCGAGGCACCACTCCCCGGTAAAGAGAGCAGGTTATAAGGCACAATCAGTGGCCC
AGCAGGGGCGCCATAGGGGCCAGTGGCGGGAGTAGGCTCCGGTGGCACTTGGCTGTCCAGAAGATGGGTAGGCCCCAGGGCCGCTGGGTGGCCCTGGTGGGCTCCAGGTGCAGGTGCCGGGATAAGCTCCAGGTGCTCCAGGGTAGGCGCCTGGAGGTGCCTGGTCAGGAT
AGCCCCCTGGGGTGCCTGCCCGGGGTAGGCCCAGGATGGGGCCCTGGGTGGCCCCTGCCCCAGCAGGCTGGTTCCCCCATGCGCCAGGCTCGCCAGGGTTTGGGTTTCCAGACCCAGATAACGCATCATGGAGCGCTCGTTGGCTGGCTCCGGACGGCTGCTGGCGAGGAG
GTGCTGCGGGCCCCCCATGTACTCTGCGTTGATACCACTGCTTCT

rvolden avatar Oct 14 '20 03:10 rvolden

I modified the traceback code in the latest commit. Hopefully, this resolves these bugs. I also removed the trackback_ok flag, since it is not needed if the traceback step can finish.

@ekg @rvolden This works on the two sequence sets you provided here; please try it out on some other data.

Yan

yangao07 avatar Oct 14 '20 09:10 yangao07

Unfortunately, I still find cases that cause this error.

fail_smoothxg_block_9338.fa.txt

abpoa -s -r 3 fail_smoothxg_block_9338.fa.txt
[simd_abpoa_align_sequence_to_subgraph1] Error in cg_backtrack. (4)

ekg avatar Oct 14 '20 19:10 ekg

I don't get the error for python2, but I get it in python3. The only modification I made to the Makefile for python3 was to change the command to python3 instead of python:

install_py: python/cabpoa.pxd python/pyabpoa.pyx python/README.md
	${py_SIMD_FLAG} python3 setup.py install

sdist: install_py
	${py_SIMD_FLAG} python3 setup.py sdist #bdist_wheel

To clarify, this is python 3.6.9, and the error I get with the same sequences I provided is [simd_abpoa_align_sequence_to_subgraph1] Error in cg_backtrack. (2)

rvolden avatar Oct 14 '20 21:10 rvolden

> Unfortunately, I still find cases that cause this error.

@ekg This is a different case and a different type of bug. I am working on it. Until I fix it, you probably want to roll back to the version where you added the traceback flag.

yangao07 avatar Oct 15 '20 01:10 yangao07

> I don't get the error for python2, but I get it in python3.

Nothing was changed related to the python side. Did you re-install pyabpoa in python3?

yangao07 avatar Oct 15 '20 01:10 yangao07

Yeah, I reinstalled using pip3 for python3

rvolden avatar Oct 15 '20 01:10 rvolden

> Yeah, I reinstalled using pip3 for python3

These changes haven't been pushed to PyPI yet, so pip3 install will give you the old version. To install locally from source, try make install_py or python3 setup.py install.

yangao07 avatar Oct 15 '20 01:10 yangao07

I should've been a bit clearer. I ran make install_py after modifying the Makefile. When that didn't work, I tried reinstalling using pip, and I also tried python3 setup.py install, which also throws the traceback error.

rvolden avatar Oct 15 '20 02:10 rvolden

This sounds really weird to me. It also works on my PC when I install with python3. Maybe you can try removing everything and reinstalling it.

yangao07 avatar Oct 15 '20 02:10 yangao07

Removed everything previously installed. It's working now. Thank you!

rvolden avatar Oct 15 '20 04:10 rvolden
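
The stale-install confusion in the last few comments can be checked before reinstalling by asking Python which file an import would load and which distribution version is registered. The helper below is a hypothetical sketch using only the standard library; `importlib.metadata` requires Python 3.8 or newer, so it would not run unmodified on the Python 3.6.9 mentioned above, and the distribution name is assumed to match the module name.

```python
import importlib.util
from importlib import metadata  # Python 3.8+


def describe_install(module_name, dist_name=None):
    """Return (path the import would load, installed distribution version)."""
    spec = importlib.util.find_spec(module_name)
    origin = spec.origin if spec is not None else None
    try:
        version = metadata.version(dist_name or module_name)
    except metadata.PackageNotFoundError:
        version = None
    return origin, version


# Shows whether pyabpoa resolves to the source build or to an older pip install.
print(describe_install("pyabpoa"))
```

If the reported path points at an older site-packages copy, removing that copy before running make install_py avoids picking up the stale build.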