Benjamin Buchfink

https://twitter.com/bbuchfink

Max Planck Institute for Biology Tübingen Tübingen, Germany

445 comments by Benjamin Buchfink

Linclust does not respect memory limit at the end of clustering.

https://github.com/bbuchfink/diamond/commit/8ec818ca160fbc26b262ae999c5c11d9c98a7e38 You can use `--oid-output` to write OIDs instead of accessions to the output file. OIDs number the sequences consecutively in the order they appear in the input file, starting from 0. You can...
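As a rough illustration of the numbering described above, the following Python sketch maps OIDs back to accessions by reading the input FASTA in order. It is not part of Diamond. The function name `oid_to_accession` and the sample records are hypothetical, and it assumes the accession is the first whitespace-delimited token of each header line.

```python
def oid_to_accession(fasta_lines):
    """Return a list whose index i holds the accession of the sequence with OID i.

    Assumption: the accession is the first whitespace-delimited token
    after '>' on each FASTA header line.
    """
    accessions = []
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            # Headers are visited in file order, so list position equals the OID.
            accessions.append(fields[0] if fields else "")
    return accessions


# Hypothetical input file content, used only for illustration.
fasta = """>seqA first protein
MKV
>seqB
MLL
AAA
>seqC third protein
MTT
""".splitlines()

lookup = oid_to_accession(fasta)
print(lookup[0], lookup[2])
```

For very large inputs, a streaming variant that writes the mapping to disk would avoid holding all accessions in memory.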

Linclust does not respect memory limit at the end of clustering.

https://github.com/bbuchfink/diamond/wiki/How-to-cluster-huge-datasets @beazerj The latest release has a new feature to run linclust in parallel on multiple nodes, which may be of interest to you. Sensitivity should also have substantially improved, and you...

Diamond with database of small kmers

Diamond does not work well by default on very short sequences and needs manual parameter tuning. I shared some tips here: https://github.com/bbuchfink/diamond/issues/832 https://github.com/bbuchfink/diamond/discussions/469 and in some other issues.

Diamond parameters to run on HPC

`-c1` is good. You can try a larger block size such as `-b6` if you can assign more memory to a task. 32 threads per task seems reasonable but could be...
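A minimal sketch of how these settings could be assembled into a command line for a batch job, written in Python. The helper `diamond_blastp_args` and the file names are hypothetical. The long options used here (`--block-size` for `-b`, `--index-chunks` for `-c`, `--threads`) are the documented long forms, but the values are only the ones suggested above and should be adjusted to the memory available per task.

```python
import shlex


def diamond_blastp_args(db, query, out, block_size=6, index_chunks=1, threads=32):
    """Build an argument list for a diamond blastp run (illustrative helper)."""
    return [
        "diamond", "blastp",
        "--db", db,
        "--query", query,
        "--out", out,
        "--block-size", str(block_size),      # same as -b6
        "--index-chunks", str(index_chunks),  # same as -c1
        "--threads", str(threads),
    ]


# Hypothetical file names, for illustration only.
args = diamond_blastp_args("nr.dmnd", "queries.faa", "hits.tsv")
print(shlex.join(args))
```

The list can be passed directly to `subprocess.run(args, check=True)` on a node where Diamond is installed.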

Diamond parameters to run on HPC

`-b8` should be slightly faster than `-b6`, but the gains are probably marginal. The parameter of `-g` is the number of targets that will be extended for each query,...

Diamond parameters to run on HPC

Another hint: the best way to speed this up would be to cluster the database first. Diamond now has a feature for this.

Diamond parameters to run on HPC

> Is the --approx-id parameters a approximation similar to CDHIT -c parameter ? (identity threshold)

Yes.

> Has someone already benchmarked the clustering of the NR database ?

This is the...

BLAST NR Support Self-compiled libraries and BLAST version 2.16

Make sure to have `libsqlite3-dev` installed on the system prior to compiling BLAST.

BLAST NR Support Self-compiled libraries and BLAST version 2.16

To include blast db support in the conda version, I would have to depend on the bioconda blast package. This is not available for the Linux and macOS ARM64 architectures...

BLAST NR Support Self-compiled libraries and BLAST version 2.16

BLAST database support has been available in the conda version since v2.1.12.
