Robert Edgar

Results 169 comments of Robert Edgar

Something like this:

usearch -cluster_fast pdb.fasta -id 0.5 -centroids centroids.fasta
grep '^>' centroids.fasta | tr -d '>' > pdbids.txt
for label in `cat pdbids.txt`
do
    mv -v $label.pdb ../some_other_directory
done
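A slightly more defensive sketch of the same move step, in case some labels have no matching file or contain spaces. The function name `move_centroid_pdbs` is hypothetical, and it assumes (as the snippet above does) that each FASTA label is exactly the base name of a `.pdb` file in the current directory.

```shell
# Hypothetical helper: move the .pdb file named after each FASTA label
# in a centroids file into a destination directory.
# Assumption: each label is exactly the base name of a .pdb file in the
# current directory.
move_centroid_pdbs() {
    centroids_fasta=$1
    dest_dir=$2
    mkdir -p "$dest_dir"
    # Header lines start with '>'; strip that leading character to get the label.
    grep '^>' "$centroids_fasta" | sed 's/^>//' |
    while IFS= read -r label; do
        if [ -f "$label.pdb" ]; then
            mv -v "$label.pdb" "$dest_dir/"
        else
            # Report labels with no matching structure file instead of failing.
            echo "skipping $label: no $label.pdb found" >&2
        fi
    done
}
```

After running the `usearch -cluster_fast` command from the comment to produce centroids.fasta, this would be called as `move_centroid_pdbs centroids.fasta ../some_other_directory`.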

The current code requires AVX2 for speed. Fixing this would require changes to the source code that would make Reseek quite a bit slower. So in theory this is...

Also, I think it is unlikely that sequences of length ~300k are globally alignable; I can't think of an example of biological sequences this long that don't have rearrangement events.

Any ensemble should work. I would tend to prefer diversified because it's more... diverse :-)

Not sure how this happened, but you are right, it seems there is a mixup between two different representations of the letter confidence, sorry for the inconvenience! A work-around is...

Thanks for the feedback. I'm aware of the "seemingly random" order issue. I was thinking the output should be sorted to bring similar sequences together (e.g. this can be done...

👍 Good suggestion, I will add it in the next release. If you're up for editing a few lines of C++ and recompiling, I can explain the patch here; it's pretty simple.