Benjamin Buchfink

445 comments by Benjamin Buchfink

It doesn't affect the function, you just need to be aware that they will be escaped as `\t` in the output.

> In your opinion, would such an approach lead to noticeable speed improvements.

Depends on the size of your query files, I suggest testing it.

> Is there a description...

This is due to the repeat masking. To turn it off, you need to use `--masking 0`.
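For reference, here is a minimal sketch of where the flag goes, written as a small Python helper that only builds the command line and does not run it. The `blastp` subcommand, the helper name `blastp_command`, and all file names are placeholders, since the command from the original thread is not shown here.

```python
import shlex


def blastp_command(query, db, out, masking=True):
    """Build an argv list for a DIAMOND search (hypothetical helper).

    The subcommand and file names are assumptions for illustration only.
    """
    cmd = ["diamond", "blastp", "--query", query, "--db", db, "--out", out]
    if not masking:
        # Disable repeat masking, as suggested in the comment above.
        cmd += ["--masking", "0"]
    return cmd


if __name__ == "__main__":
    # Print the command so it can be inspected before running it by hand.
    print(shlex.join(blastp_command("query.faa", "ref.dmnd", "hits.tsv", masking=False)))
```

Printing the command first, rather than executing it, makes it easy to compare a run with and without masking on the same input.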

There are some issues causing increased memory use that will be fixed in the next release. For now, one thing you could try is using `--bin 256` (or possibly higher).

Another option would be `--cluster-steps faster_lin fast_lin`; that should be sufficient for an 80% identity cutoff.
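As a rough sketch of how this could be put together, the Python helper below only assembles the command line. It assumes the `diamond cluster` workflow and that the 80% cutoff is set with `--approx-id`; the helper name `cluster_command` and the file names are hypothetical, as the original command is not part of this thread excerpt.

```python
import shlex


def cluster_command(db, out, approx_id=80, steps=("faster_lin", "fast_lin")):
    """Build an argv list for a DIAMOND clustering run (hypothetical helper).

    Assumes `diamond cluster` with the identity cutoff given via --approx-id.
    """
    cmd = ["diamond", "cluster", "--db", db, "--out", out,
           "--approx-id", str(approx_id)]
    # Restrict the cascade to the two steps suggested in the comment above.
    cmd += ["--cluster-steps", *steps]
    return cmd


if __name__ == "__main__":
    # Print the command for inspection instead of executing it.
    print(shlex.join(cluster_command("proteins.faa", "clusters.tsv")))
```

Keeping the steps as a parameter makes it simple to compare this shorter cascade against the default one on the same input.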

DIAMOND is not configured to find very short hits by default. I shared some tips on how to do this here: https://github.com/bbuchfink/diamond/issues/832

At the moment, you need to specify `qseqid`, not `cseqid`, on the command line. It is inconsistent and should be changed in a future version.

Please provide the command line you used to run diamond and your version.

These are not the files you provided. I ran diamond blastx of your `ncor_cdhit.fasta` against your `ncor_cdhit.fasta.transdecoder.pep` and it completed correctly.