Benjamin Buchfink
It should work using `--id 100` because that is applied to the unrounded number. If it does not, can you send me a test case? Also, you can use `-f...
The --header option has been included in the latest release. It prints a description of the columns and also the diamond version and invocation. Feel free to check it before...
Ok, I see your point about the header format. Changing things that break compatibility with older versions is also problematic, so I will probably add something like `--header 2` to...
The option does work for me. Check your version using `diamond version` and upgrade if necessary.
To compute the number of query blocks, multiply the number of DNA letters in the input file by 2 and divide by the block size (2000000000 in your case).
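As a rough illustration of that arithmetic, here is a minimal Python sketch. The function name `estimate_query_blocks` is hypothetical, the input sizes in the usage note are made up, and rounding up to a whole block is an assumption on my part rather than something stated above.

```python
def estimate_query_blocks(dna_letters: int, block_size: int = 2_000_000_000) -> int:
    """Estimate the number of query blocks for a DNA input file.

    Follows the rule of thumb from the comment above:
    (number of DNA letters * 2) / block size.
    Rounding up to a whole block is an assumption of this sketch.
    """
    if dna_letters < 0:
        raise ValueError("dna_letters must be non-negative")
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    scaled_letters = dna_letters * 2
    # Integer ceiling division, avoids floating-point rounding on large inputs.
    return -(-scaled_letters // block_size)
```

For example, under these assumptions an input with 5,500,000,000 DNA letters and the block size of 2000000000 gives 11,000,000,000 / 2,000,000,000 = 5.5, which rounds up to 6 blocks.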
Yes, that seems correct. The easiest way to reduce runtime would be to use a smaller database if that works for you, e.g. UniRef50 or AnnoTree, see here: https://journals.asm.org/doi/full/10.1128/msystems.01408-21 To...
Ok thanks, I'll see what I can do.
Hi Markus, as you have noted correctly, Diamond is optimized to be used with large query files. If you use 1,000,000 proteins as input, you will surely get a big...
Hi Oliver, Diamond was not designed for this use case of >90% identity hits only, so I'm pretty sure that substantial speedups would be possible there. Simply building a faster...
Yes, that's probably a good idea.