Milot Mirdita

429 comments by Milot Mirdita

You have to specify the `--report-mode 1` parameter to generate Krona output. Splitting `easy-taxonomy` up makes generating both outputs at the same time easier: ``` mmseqs createdb 00.rawdata/ccs.fasta 00.rawdata/M1.ccs.fasta qdb...

You need a machine with 1 TB of RAM to create a precomputed index for the ColabFoldDB. Are you actually planning to run a lot of small queries (like the...

Then I would recommend deleting the already created precomputed index (`rm *.idx*`) and just using `colabfold_search` without the precomputed index.

This was also a while ago; however, for clustering you should pretty much always supply a sequence identity threshold with `--min-seq-id`. The cascaded clustering of MMseqs2 can still put together...

That seems about right? It aligned two residues successfully (from 83 to 84). You might want to require some coverage thresholds (`-c`/`--cov-mode`) or a minimum alignment length threshold (`--min-aln-len`).
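A minimal sketch of how those thresholds might be passed, assuming the alignments came from an `mmseqs search` run. The database names (`queryDB`, `targetDB`, `resultDB`), the `tmp` directory, and the threshold values are placeholders chosen for illustration, not recommendations. The command is only executed if `mmseqs` is installed and the query database exists; otherwise it is printed.

```shell
# Placeholder names (assumptions, not taken from the original comment).
QUERY_DB="queryDB"
TARGET_DB="targetDB"
RESULT_DB="resultDB"
TMP_DIR="tmp"

# -c 0.8           require at least 80% of the residues to be covered by the alignment
# --cov-mode 0     apply the coverage threshold to both query and target
# --min-aln-len 30 discard alignments shorter than 30 residues (illustrative value)
CMD="mmseqs search $QUERY_DB $TARGET_DB $RESULT_DB $TMP_DIR -c 0.8 --cov-mode 0 --min-aln-len 30"

if command -v mmseqs >/dev/null 2>&1 && [ -e "$QUERY_DB" ]; then
  $CMD
else
  # Dry run: show the command that would be executed.
  echo "$CMD"
fi
```

With either filter in place, a two-residue alignment like the one above would no longer be reported.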

#557 is the same issue. We'll think of something.

`--db-load-mode` won't help in this case. The parameter handles loading of precomputed indices of (search) databases. Normally, we don't use precomputed indices for clustering. Ideally the `tmp` folder should be...

I am not sure we deal well with 50-mers; the default nucleotide k-mer size is 14 or 15 (depending on the database size). Also, we have predefined spaced k-mer patterns only...

I thought that `-mavx2` would imply (most) lower SSE levels. We also use an SSSE3 instruction in some important place (iirc), so should we also enable that explicitly? (Edit: we...

This was also a while ago. For your use case I would only call `easy-linclust`. You won't benefit from the deeper clustering at a sequence identity threshold of 98%. That should...
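As a rough sketch, an `easy-linclust` call at a 98% sequence identity threshold could look like the following. The input file `input.fasta`, the output prefix `clusterRes`, and the `tmp` directory are hypothetical names. The command runs only if `mmseqs` is installed and the input file exists; otherwise it is printed.

```shell
# Placeholder names (assumptions, not taken from the original comment).
INPUT="input.fasta"
OUT_PREFIX="clusterRes"
TMP_DIR="tmp"

# --min-seq-id 0.98 keeps only cluster members with at least 98% sequence identity
CMD="mmseqs easy-linclust $INPUT $OUT_PREFIX $TMP_DIR --min-seq-id 0.98"

if command -v mmseqs >/dev/null 2>&1 && [ -e "$INPUT" ]; then
  $CMD
else
  # Dry run: show the command that would be executed.
  echo "$CMD"
fi
```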