Focus sequence non-functional if alphabet provided

Open: njrollins opened this issue 5 years ago • 1 comment

When an alphabet is specified with -a, all sequence positions are modeled, regardless of whether they are uppercase or lowercase in the focus sequence.

Without an alphabet (plmc assumes the protein alphabet, which is a problem for RNAs):

plmc -c my.model -f target_seq -t 0.10  test.a2m
Found focus target_seq as sequence 1
0 valid sequences out of 1 
433 sites out of 450

With an RNA alphabet, all positions are modeled, even lowercase ones:

plmc -c my.model -f target_seq -t 0.10  -a -AUGC test.a2m
Found focus target_seq as sequence 1
1 valid sequences out of 1 
450 sites out of 450
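
The expected focus behavior can be sketched in a few lines of Python: only positions that are uppercase in the focus sequence should count as sites, whichever alphabet is in use. This is an illustration only, not plmc's implementation. The helpers `read_a2m` and `focus_columns` and the toy alignment are hypothetical, and excluding gap characters in the focus sequence is an assumption.

```python
def read_a2m(text):
    """Parse A2M/FASTA-formatted text into a dict of name -> sequence."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def focus_columns(focus_seq):
    """Return indices of positions that are uppercase in the focus sequence.

    Assumption: lowercase letters and gap characters ('-', '.') are excluded,
    independent of the alphabet passed to the model.
    """
    return [i for i, ch in enumerate(focus_seq) if ch.isupper()]


# Toy alignment (hypothetical data, not the test.a2m from this issue)
example = ">target_seq\nACGUacgu-ACG\n>other\nACGUACGUAACG\n"
records = read_a2m(example)
cols = focus_columns(records["target_seq"])
print(f"{len(cols)} sites out of {len(records['target_seq'])}")
```

Running a check like this on the focus sequence in test.a2m would give the site count to compare against the "sites out of" line that plmc prints.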

Jul 12 '20 17:07 njrollins

This update should exclude lowercase positions from being modeled when a custom alphabet is provided: https://github.com/debbiemarkslab/plmc/pull/10#issue-469703712

Aug 18 '20 19:08 joshuaroll