Benjamin Buchfink comments

Results: 445 comments of Benjamin Buchfink

Advice for diamond distributed-memory implementation on HPC

That seems very unusual. With these settings diamond shouldn't use more than 40 GB of memory, and it should run at maximum CPU% most of the time. Either your data behaves very...

Advice for diamond distributed-memory implementation on HPC

Maybe the MPI is causing the problem then, like spawning multiple instances of diamond on one host? Not sure since I'm not very knowledgeable about MPI.

alignment statistics

You could use the options `--log --no-heartbeat` which will create a `diamond.log` file containing this information. Yes, reporting unaligned queries also works for the tabular format.

blast alignment time up to 5X slower from v2.0.11 to v2.0.12

I can't confirm this issue when testing your command line with an A. thaliana proteome. Could you send me your input file so I can check this?

blast alignment time up to 5X slower from v2.0.11 to v2.0.12

I'm seeing the same effect on your data. The difference is due to seeds being masked based on complexity instead of frequency, a change that was introduced in this version. Your dataset seems...

About out.xml too big

I'd recommend against using the XML format, but if you must, you can try compressing the file with gzip. Other than that, I'm not sure how you would get...
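As a rough illustration of the gzip suggestion, here is a minimal Python sketch that compresses an existing output file in chunks so it never has to fit in memory. The helper name `gzip_file` and the file name `out.xml` are assumptions for the example, not part of diamond.

```python
import gzip
import shutil
from pathlib import Path


def gzip_file(src, dst=None, chunk_size=1024 * 1024):
    """Compress `src` to `dst` (default: `src` + '.gz') and return the new path.

    The data is streamed in chunks, so large files are not loaded into memory.
    The original file is left in place.
    """
    src = Path(src)
    dst = Path(dst) if dst is not None else src.with_name(src.name + ".gz")
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, chunk_size)
    return dst


if __name__ == "__main__":
    # Hypothetical file name; replace with the actual output path.
    if Path("out.xml").exists():
        print(gzip_file("out.xml"))
```

The compressed file can be read back with `gzip.open(path, "rt")` without unpacking it to disk first.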

I see, the `--long-reads` option is overriding your `--max-target-seqs` setting here. Use `--range-culling -F15` instead of `--long-reads`. With `--range-culling` you will still get multiple hits for a query if...
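The option swap described above could be sketched as follows. This is only an illustration: it assumes a translated search (`blastx`), the file names are placeholders, and `build_diamond_args` is a hypothetical helper rather than anything shipped with diamond.

```python
import shlex


def build_diamond_args(db, query, out, max_target_seqs):
    """Assemble a diamond command line that avoids --long-reads.

    --range-culling and -F 15 are passed explicitly so that the
    --max-target-seqs value given here is not overridden.
    Assumption: a blastx (translated) search.
    """
    return [
        "diamond", "blastx",
        "--db", db,
        "--query", query,
        "--out", out,
        "--range-culling",
        "-F", "15",
        "--max-target-seqs", str(max_target_seqs),
    ]


if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    args = build_diamond_args("ref.dmnd", "reads.fasta", "hits.tsv", 5)
    print(shlex.join(args))
```

The resulting list can be passed to `subprocess.run(args, check=True)` on a machine where diamond is installed.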

Error: Invalid output format. Allowed values: 0,5,6,100,101,102

I'm aware that this doesn't work, although it would probably be better if it did.

Status/progress bar?

That would certainly be a useful feature and I'll put it on my todo list. It is however not that simple to estimate the total progress. If you use the...

Retrieval more taxonomics IDs than the one present in the "prot.accession2taxid.FULL"

If you look up this protein at NCBI, you can see under identical proteins that there's an entry (`MBO4974725.1`) with taxon ID 29523. These entries are merged if you use...