cdhit
clstr2txt.pl
Hi
I have one issue. I am running hierarchical cd-hit from 80% to 60%, then from 60% to 30%, and then I used clstr2txt to combine the multiple cluster files from the hierarchical cd-hit run. I have difficulty reading the clstr2txt output. How can I get the clusters at 30% identity? Should I select the sequences that have clstr_iden less than or equal to 30%?
id      clstr  clstr_size  length  clstr_rep  clstr_iden  clstr_cov
Q8WZ42  0      2           34350   1          100         100%
O97791  0      2           2000    0          74.70%      5%
Q8NF91  1      1           8797    1          100         100%
Q03001  2      2           7570    1          100         100%
Q03001  2      2           7570    0          100.00%     100%
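For reference, here is a minimal sketch of one way to read this output: group the sequence ids by the clstr column instead of filtering on clstr_iden. It assumes the file is whitespace-delimited with the header shown above, and it assumes (I have not confirmed this) that the clstr column holds the cluster number at the last level of the hierarchy. group_by_cluster is a hypothetical helper name, not part of cd-hit.

```python
from collections import defaultdict


def group_by_cluster(lines):
    """Group sequence ids by the value in the `clstr` column.

    Assumption: `lines` is whitespace-delimited clstr2txt-style output
    whose first non-empty line is the header row.
    """
    rows = [ln.split() for ln in lines if ln.strip()]
    header, data = rows[0], rows[1:]
    id_col = header.index("id")
    clstr_col = header.index("clstr")

    clusters = defaultdict(list)
    for fields in data:
        clusters[fields[clstr_col]].append(fields[id_col])
    return dict(clusters)


# First rows of the output posted above, used as sample input.
sample = """id clstr clstr_size length clstr_rep clstr_iden clstr_cov
Q8WZ42 0 2 34350 1 100 100%
O97791 0 2 2000 0 74.70% 5%
Q8NF91 1 1 8797 1 100 100%
"""

clusters = group_by_cluster(sample.splitlines())
for clstr_id, members in clusters.items():
    print(clstr_id, members)
```

To run it on a real file, pass the open file object instead of sample.splitlines(), since the function only needs an iterable of lines.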