Martin Steinegger comments

Results: 234 comments by Martin Steinegger

Error: result2flat died

This sounds like a filesystem error while closing the result file. Could the system have run out of disk space?

mmseqs much slower than the MMseqs2 MSA server

Here you can read more about MMseqs2: https://github.com/soedinglab/MMseqs2/wiki

Crash in easy-cluster at prefilter step "no k-mer could be extracted"

Thank you for reporting this. I see what's happening here. The prefilter sets the k-mer threshold to 130, and none of the k-mers in the sequence reaches that threshold. So no...

You can use the `--format-output` parameter to define your custom columns in the output format. Read more about this [here](https://github.com/soedinglab/mmseqs2/wiki#custom-alignment-format-with-convertalis).

Clustering billions of sequences gets stuck on kmermatcher

The maximum size of a single clustering cannot exceed 2^32 - 1, which is roughly 4.3 billion sequences. To cluster 16 billion you need some kind of step...
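As a quick sanity check on that limit, here is a minimal sketch that computes 2^32 - 1 and compares it with the number of sequences in a FASTA file. The file name `demo_input.fasta` and its contents are made up for illustration, and counting header lines with `grep` is just one assumed way to get the sequence count.

```shell
# Upper bound stated above: 2^32 - 1 sequences per clustering.
max_seqs=$(( (1 << 32) - 1 ))
echo "limit: $max_seqs"

# Hypothetical toy input; a real input would be your own FASTA file.
printf '>s1\nMKV\n>s2\nMKI\n' > demo_input.fasta

# Each FASTA record starts with '>', so counting those lines counts sequences.
n_seqs=$(grep -c '^>' demo_input.fasta)

if [ "$n_seqs" -gt "$max_seqs" ]; then
  echo "$n_seqs sequences: above the limit for one clustering"
else
  echo "$n_seqs sequences: within the limit for one clustering"
fi
```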

MMseqs for big data

MMseqs2 is optimized to process multiple queries at once, so it makes sense to package your searches into one large FASTA file. If you'd like to perform fast single...
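Packaging many queries into one file can be sketched as follows. The directory `demo_queries`, the file names, and the sequences are hypothetical, and the sketch assumes every input file ends with a newline so that records do not run together. The search itself is left as a comment because it needs an MMseqs2 installation and a real target.

```shell
# Hypothetical per-query FASTA files.
mkdir -p demo_queries
printf '>q1\nMKV\n' > demo_queries/a.fasta
printf '>q2\nMKI\n' > demo_queries/b.fasta

# Concatenate all queries into a single FASTA file.
cat demo_queries/*.fasta > all_queries.fasta

# Confirm how many query sequences ended up in the combined file.
grep -c '^>' all_queries.fasta

# One search over all queries at once (target name is a placeholder):
# mmseqs easy-search all_queries.fasta target result.m8 tmp
```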

Convert Hits from a Search to Fasta files

@juliacpowell1999 Yes, you can get the target sequences by adding `tseq` to the `--format-output` options. For example:

```
mmseqs easy-search query target result tmp --format-output query,target,tseq
```

Convert Hits from a Search to Fasta files

```
awk '{print ">"$2; print $3}' result > result.fasta
```
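To see what this conversion produces, here is a minimal sketch on made-up data. It assumes the result file has the three tab-separated columns `query`, `target`, `tseq`; the identifiers and sequences are invented for illustration.

```shell
# Hypothetical three-column result: query, target, target sequence.
printf 'q1\tt1\tMKVLA\nq1\tt2\tMKILG\n' > result

# Header line from column 2 (target id), sequence line from column 3 (tseq).
awk '{print ">"$2; print $3}' result > result.fasta

cat result.fasta
```

A target that is hit by several queries is written once per hit. If that is unwanted, `awk '!seen[$2]++ {print ">"$2; print $3}'` keeps only the first occurrence of each target.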

Cluster with greedy set cover method not reproducible.

The order of the clusters in the output file can differ between runs. However, each cluster should still have the same members. Are the members changing, or just the cluster order?
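One way to answer that question is to compare two runs while ignoring line order. This is a minimal sketch, assuming each run produced a two-column TSV of representative and member; the file names and identifiers are hypothetical. Note that it also reports a difference if a cluster keeps its members but is listed under a different representative.

```shell
# Hypothetical cluster tables from two runs: same pairs, different order.
printf 'A\tA\nA\tB\nC\tC\n' > run1_cluster.tsv
printf 'C\tC\nA\tA\nA\tB\n' > run2_cluster.tsv

# Sort both tables bytewise so that only the content matters, not the order.
sorted1=$(LC_ALL=C sort run1_cluster.tsv)
sorted2=$(LC_ALL=C sort run2_cluster.tsv)

if [ "$sorted1" = "$sorted2" ]; then
  verdict="same clusters"
else
  verdict="clusters differ"
fi
echo "$verdict"
```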

[Question] Would clusters of proteins from MMSEQS2 be considered orthologs?

No, we do not filter for orthology in our clustering. They are homologous.