Benjamin Buchfink

Results: 445 comments by Benjamin Buchfink

Hi Nick, I'm aware of this issue and I'll try to support a dynamic length in future versions. For now, you can also change the max length by editing `src/data/taxonomy.h:45`...

If you make the files available to me, I can look into it.

This happens because nodes.dmp is implicitly assumed to be sorted by taxid. I will remove this restriction in a future release, but for now you can simply sort...
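One possible way to do such a sort is sketched below in Python. This is not necessarily the method the comment goes on to describe. It assumes the standard NCBI taxonomy dump layout, where each line of nodes.dmp starts with the numeric taxid followed by a `|` field separator. The function names and the output file name are made up for illustration.

```python
def taxid_of(line: str) -> int:
    """Return the taxid, assumed to be the first '|'-separated field."""
    return int(line.split("|", 1)[0].strip())


def sort_nodes_dmp(src: str, dst: str) -> None:
    """Write a copy of src to dst with lines ordered by numeric taxid."""
    with open(src, encoding="utf-8") as fh:
        # Skip blank lines and normalise line endings so that a final line
        # without a trailing newline cannot fuse with its new neighbour.
        lines = [ln.rstrip("\n") + "\n" for ln in fh if ln.strip()]
    # Sort numerically: a plain text sort would place taxid 10 before 2.
    lines.sort(key=taxid_of)
    with open(dst, "w", encoding="utf-8") as fh:
        fh.writelines(lines)
```

For example, `sort_nodes_dmp("nodes.dmp", "nodes.sorted.dmp")` writes a sorted copy and leaves the original file untouched.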

Did you use a BLAST database when running diamond v2? That would explain the different accessions. Note that, for example, NP_080749.2 and Q922F4 are the same protein.

You listed the same file sizes for both runs; I assume that is an error? You can try to reproduce this problem on a smaller sequence set so I can...

Also try running `diamond view` with `-k 250` when using v2.0.6; that may explain the difference in m8 file size.

I would guess this is due to the runtime repeat masking, so try running with `--masking 0`. Diamond is not very efficient for such small query files, but improvements in...

I'm not sure what else could be causing this difference. Optimizations for small query files are available but still in beta stage (see https://github.com/bbuchfink/diamond/issues/419#issuecomment-831154792). It will probably be...

v2.0.11 now contains some optimizations for small query files. You can also get the old behaviour back using the option `--algo ctg`, which may or may not improve performance depending...

You can try using a lower `-c`, like `-c1 -b2` or `-c1 -b1.5` if the first one fails. Alternatively, you can try a bigger block size, like `-b4 -c4`. I'm...