odp
find duplicate protein substrings with cd-hit
use cd-hit to find duplicate protein substrings in the input protein files
use this for the "best" filtering option
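A minimal sketch of how a cd-hit call for this step could be assembled. The exact arguments odp passes are not stated here, so everything below is an assumption: the function name `build_cdhit_command` is hypothetical, the identity threshold of 1.0 (exact matches only) is a guess at what "duplicate" means, and only the standard cd-hit options `-i` (input FASTA), `-o` (output prefix) and `-c` (sequence identity threshold) are used.

```python
from typing import List


def build_cdhit_command(
    fasta_in: str,
    out_prefix: str,
    identity: float = 1.0,      # assumption: treat only exact matches as duplicates
    cdhit_bin: str = "cd-hit",  # assumption: binary is on PATH or given as a path
) -> List[str]:
    """Return the argument list for a cd-hit run (hypothetical helper)."""
    if not 0.0 < identity <= 1.0:
        raise ValueError("identity must be in the interval (0, 1]")
    # -i: input protein FASTA, -o: output prefix, -c: identity threshold
    return [cdhit_bin, "-i", fasta_in, "-o", out_prefix, "-c", str(identity)]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without cd-hit installed.
    print(" ".join(build_cdhit_command("proteins.fasta", "proteins_nr")))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` once the cd-hit binary has been built.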
needs these files: cd-hit, cdhit.c++, cdhit-common.h, cdhit-common.o, cdhit.o, cdhit-utility.o, Makefile, license.txt
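A quick way to confirm that the files listed above are present before running the filter. This is only a sketch: the helper name `missing_cdhit_files` and the idea of pointing it at a single directory are assumptions, not part of odp.

```python
from pathlib import Path
from typing import List

# File names taken from the list above.
REQUIRED_FILES = [
    "cd-hit",
    "cdhit.c++",
    "cdhit-common.h",
    "cdhit-common.o",
    "cdhit.o",
    "cdhit-utility.o",
    "Makefile",
    "license.txt",
]


def missing_cdhit_files(directory: str) -> List[str]:
    """Return the required file names that are not regular files in `directory`."""
    base = Path(directory)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_cdhit_files(".")
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all cd-hit files present")
```

An empty result means every listed file was found; it does not check that the `cd-hit` binary is executable or was compiled for the current platform.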