Benjamin Buchfink comments

Results: 445 comments of Benjamin Buchfink

retrieve hits that are identical match to the query?

It should work using `--id 100` because that is applied to the unrounded number. If it does not, can you send me a test case? Also, you can use `-f...
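As a complementary check on the output side, here is a minimal Python sketch that keeps only alignments with zero mismatches and zero gap openings. It assumes the default 12-column BLAST-style tabular layout (`qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore`). The function name `exact_hits` and the sample rows are made up for illustration. It does not check whether the alignment covers the full query, since the query length is not among the default columns.

```python
import io

# Default 12 columns of BLAST-style tabular output (assumed layout).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def exact_hits(handle):
    """Yield rows (as dicts) whose alignment has no mismatches and no gap openings."""
    for line in handle:
        line = line.rstrip("\n")
        # Skip blank lines and comment/header lines.
        if not line or line.startswith("#"):
            continue
        row = dict(zip(COLUMNS, line.split("\t")))
        if int(row["mismatch"]) == 0 and int(row["gapopen"]) == 0:
            yield row


# Made-up sample rows: the second has one mismatch in 2500 columns, which a
# rounded identity column could still display as 100.
sample = (
    "q1\ts1\t100.0\t50\t0\t0\t1\t50\t1\t50\t1e-30\t105\n"
    "q2\ts2\t100.0\t2500\t1\t0\t1\t2500\t1\t2500\t0.0\t4800\n"
)
hits = list(exact_hits(io.StringIO(sample)))
print([h["qseqid"] for h in hits])
```

Filtering on the mismatch and gap columns avoids relying on the rounded identity value shown in the output.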

Add a --header option when --outfmt 6 is used

The --header option has been included in the latest release. It prints a description of the columns and also the diamond version and invocation. Feel free to check it before...

Add a --header option when --outfmt 6 is used

Ok, I see your point about the header format. Changing things that break compatibility with older versions is also problematic, so I will probably add something like `--header 2` to...

Add a --header option when --outfmt 6 is used

The option does work for me. Check your version using `diamond version` and upgrade if necessary.

How many processing query blocks are there in total?

To compute the query blocks, take the number of DNA letters in the input file * 2, divided by the block size (2000000000 in your case).
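The arithmetic above can be sketched in a few lines of Python. The function name `query_block_count` is hypothetical, and rounding up to a whole block is an assumption on my part; the comment only gives the multiplication and the division.

```python
def query_block_count(dna_letters: int, block_size_letters: int = 2_000_000_000) -> int:
    """Estimate the number of query blocks for a DNA input file.

    Follows the rule of thumb: (DNA letters * 2) / block size.
    Rounding up to a whole block is an assumption, not stated in the comment.
    """
    if dna_letters < 0 or block_size_letters <= 0:
        raise ValueError("letter count must be >= 0 and block size must be > 0")
    doubled_letters = dna_letters * 2
    # Integer ceiling division, avoids floating-point rounding on large inputs.
    return -(-doubled_letters // block_size_letters)


# Example with a made-up input of 3.5 billion DNA letters:
# 3.5e9 * 2 = 7e9, divided by 2e9 = 3.5, rounded up to 4 blocks.
print(query_block_count(3_500_000_000))
```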

How many processing query blocks are there in total?

Yes, that seems correct. The easiest way to reduce runtime would be to use a smaller database if that works for you, e.g. UniRef50 or AnnoTree, see here: https://journals.asm.org/doi/full/10.1128/msystems.01408-21 To...

Please provide manpage for diamond

Ok thanks, I'll see what I can do.

diamond slow, compared to blastp ?

Hi Markus, as you have noted correctly, Diamond is optimized to be used with large query files. If you use 1,000,000 proteins as input, you will surely get a big...

diamond slow, compared to blastp ?

Hi Oliver, Diamond was not designed for this use case of >90% identity hits only, so I'm pretty sure that substantial speedups would be possible there. Simply building a faster...

Timing of output format error indication

Yes that's probably a good idea.