dnadesign
Megamash cannot call complicated sequences
seq_recovery=0.4152_0, for example, cannot be processed into a megamash table because it doesn't have any unique 16mers. This comes from a flawed assumption: ESPECIALLY in protein variant libraries, 16mer windows won't necessarily be unique. Even if a sequence is unique within the library as a whole, it can share every one of its 16mer windows with some other sequence, so a sliding 16mer window has nothing to pick it up by. This is a real issue.
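To make the failure mode concrete, here is a minimal Python sketch. It is not Megamash's actual code; the backbone, the mutation sites, and the helper names (`kmers`, `unique_kmers`, `mutate`) are all made up for illustration. It builds a small combinatorial library with two mutation sites more than 16 bases apart. All four sequences are distinct, but no 16mer window can span both sites, so every window of every variant also occurs in at least one sibling.

```python
from collections import Counter

K = 16  # window size discussed in this issue


def kmers(seq, k=K):
    """Return the set of all k-length windows in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unique_kmers(library, k=K):
    """For each named sequence, return the k-mers found in no other sequence."""
    kmer_sets = {name: kmers(seq, k) for name, seq in library.items()}
    counts = Counter()
    for kmer_set in kmer_sets.values():
        counts.update(kmer_set)  # counts the number of sequences holding each k-mer
    return {
        name: {km for km in kmer_set if counts[km] == 1}
        for name, kmer_set in kmer_sets.items()
    }


def mutate(seq, pos):
    """Swap the base at pos for a different one (hypothetical point mutation)."""
    new_base = "A" if seq[pos] != "A" else "C"
    return seq[:pos] + new_base + seq[pos + 1:]


# Hypothetical 60-base backbone with two sites more than K bases apart.
backbone = "ATGGCTAGCAAGGGCGAGGAACTGTTCACCGGCGTGGTGCCCATCCTGGTGGAGCTGGAC"
site_a, site_b = 10, 45

library = {
    "wt": backbone,
    "mutA": mutate(backbone, site_a),
    "mutB": mutate(backbone, site_b),
    "mutA_mutB": mutate(mutate(backbone, site_a), site_b),
}

unique = unique_kmers(library)
for name, kms in unique.items():
    print(name, len(kms))  # every variant reports 0 unique 16mers
```

With only `wt` and `mutA` in the library, both would have unique 16mers. It is the combination of sites that removes them, which is why variant libraries are hit hardest.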
I'm still trying to figure out how to fix this.