Milot Mirdita

Results 432 comments of Milot Mirdita

I think the important part is to set `-a`. Me mentioning `--alignment-mode` was a bit misleading. The gap open count is computed based on the presence of a backtrace, which...

That's an interesting application of MMseqs2's clustering. It should be possible to do what you want; however, it will require much more parameter tweaking. Also, did you generate your own...

No guarantees, but that's what I would try first. Try k-mer sizes from 6 to 15. More things might go wrong though, as you are breaking some pretty fundamental assumptions....

Changing the alphabet size will cause it to use MMseqs2's built-in alphabet reduction. Since you seem to be trying various reduced alphabets I assume that you don't want to it...

Currently, everything is tailored to the NCBI taxonomy format (taxdump). For GTDB, we transform their taxonomy to a names/nodes.dmp format. If your taxonomy is NCBI based, then you can just...

Taxids are numeric. Your tax tree in names.dmp and nodes.dmp needs to have full lineages up to the root and must not contain cycles. The labels themselves are not...
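A minimal sketch of how one might sanity-check a custom taxonomy against these constraints before handing it to MMseqs2. It assumes the NCBI taxdump layout for nodes.dmp (fields separated by `|`, taxid first, parent taxid second) and that the root is taxid 1 with itself as parent. The function names `parse_nodes_dmp` and `find_broken_lineages` are hypothetical helpers and not part of MMseqs2.

```python
from typing import Dict, Iterable, List


def parse_nodes_dmp(lines: Iterable[str]) -> Dict[int, int]:
    """Map each taxid to its parent taxid from nodes.dmp-style lines.

    int() raises ValueError on non-numeric taxids, which surfaces
    that problem early.
    """
    parents: Dict[int, int] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = [field.strip() for field in line.split("|")]
        parents[int(fields[0])] = int(fields[1])
    return parents


def find_broken_lineages(parents: Dict[int, int], root: int = 1) -> List[int]:
    """Return taxids whose lineage never reaches the root.

    A lineage is broken if it hits a parent that is missing from the
    table or revisits a node (a cycle). root=1 is an assumption based
    on the NCBI convention.
    """
    broken: List[int] = []
    for taxid in parents:
        seen = set()
        current = taxid
        while current != root:
            if current in seen or current not in parents:
                broken.append(taxid)
                break
            seen.add(current)
            current = parents[current]
    return broken
```

For a real file, something like `find_broken_lineages(parse_nodes_dmp(open("nodes.dmp")))` should return an empty list if every lineage is complete and acyclic.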

Yes, you can point `createtaxdb` to your existing database with `--ncbi-tax-dump` and `--tax-mapping-file` as described above. In fact, that's how the `databases` commands work: they download sequences, create a db...
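As a rough illustration, the call could be assembled like this. The two flags come from the comment above; the positional order (sequence database, then a temporary directory) and all paths are assumptions for the sketch, and `build_createtaxdb_command` is a hypothetical helper. The command is only printed, not executed.

```python
import shlex
from typing import List


def build_createtaxdb_command(
    seq_db: str, tmp_dir: str, tax_dump_dir: str, mapping_file: str
) -> List[str]:
    """Assemble an argv list for `mmseqs createtaxdb` (assumed argument order)."""
    return [
        "mmseqs", "createtaxdb", seq_db, tmp_dir,
        "--ncbi-tax-dump", tax_dump_dir,        # directory with names.dmp/nodes.dmp
        "--tax-mapping-file", mapping_file,     # tsv mapping sequences to taxids
    ]


# Placeholder paths, purely illustrative.
command = build_createtaxdb_command("myDB", "tmp", "taxdump_dir", "mapping.tsv")
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` would run it, provided `mmseqs` is on the `PATH`.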

"Mapping is empty" sounds like something went wrong while creating the tsv files I mentioned. Could you please write down the steps you took to generate the tax mapping?

Okay, that might be correct. What does `MicroEuk100.eukaryota_odb10.lookup` look like?