Milot Mirdita

Results 432 comments of Milot Mirdita

The first column of both these files needs to be 1-to-1 mappable with each other (that's what `createtaxdb` does internally).
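One way to check this before running `createtaxdb` is to compare the first columns of the two files directly. The following is a minimal sketch, assuming both files are tab-separated; the function names and the file names in the usage note are hypothetical and not part of MMseqs2. It only reports keys that are missing on one side and does not detect duplicated keys.

```python
def first_column(lines, sep="\t"):
    """Collect the first field of every non-empty line."""
    return {line.rstrip("\n").split(sep)[0] for line in lines if line.strip()}


def compare_first_columns(lines_a, lines_b, sep="\t"):
    """Return (only_in_a, only_in_b); both sets are empty when the keys match."""
    keys_a = first_column(lines_a, sep)
    keys_b = first_column(lines_b, sep)
    return keys_a - keys_b, keys_b - keys_a
```

With real files this could be called as `compare_first_columns(open("file_a.tsv"), open("file_b.tsv"))`, where both paths are placeholders. Any key in either returned set is an entry without a partner in the other file.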

That does look wrong. Each entry in the `.lookup` (e.g. `bf501444362aba95481a93b6b17ab32c`, the first one) needs to have an entry in your `taxid.map`. E.g. if this were a human sequence: ```...

And to further clarify: The `bf501444362aba95481a93b6b17ab32c` is extracted from the FASTA header. Usually the first word before any whitespace after the `>`.
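The usual case described above (first word after the `>`, up to the first whitespace) can be sketched as follows. This is an illustration only: the function name is hypothetical, and it does not reproduce any special header handling MMseqs2 may apply to particular header formats.

```python
def fasta_identifiers(lines):
    """Yield the first whitespace-delimited word after '>' for each header line."""
    for line in lines:
        if line.startswith(">"):
            words = line[1:].split()
            if words:  # skip headers that are empty after '>'
                yield words[0]
```

Running this over your FASTA file gives the list of identifiers you would expect to find in the `.lookup`, which can then be compared against `taxid.map`.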

I haven't tried seqkit for taxonomy database building. You can also map names based on the `.source` files with `--tax-mapping-mode 1`. This will match your `taxid.map` to the `.source` instead...

Does the crash also happen with a smaller `max-seqs` (currently it's set to `--max-seqs 1000000`)? Are the failed proteins on the query side? Do these queries also crash against a...

It doesn't look like the MSAs for the KOFAM profiles are available for download, and we can't create profiles for MMseqs2 without the original MSAs. I don't have specific recommendations for...

We need to run the [format_substitution_matrix.R](https://github.com/soedinglab/MMseqs2/blob/master/util/format_substitution_matrix.R) first. However, I think the script is currently hardcoded to bit/2 (~~the fixme line~~, I think we have bigger issues here). The current VTML160...

I am not sure if `

Also, mmseqs can read gzip files directly, so you shouldn’t need either piping or `

What's the length of the reads? We require a minimum length of 30 AA for extracted ORFs; you can adjust this with `--min-length`. Currently it's extracting 0 ORFs and...
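The arithmetic behind the read-length question: an ORF of 30 AA needs at least 90 nt, so a read of length L can encode at most floor(L / 3) amino acids. A minimal sketch of that bound, with hypothetical function names; the default of 30 is taken from the comment above, and the sketch ignores start/stop codons and any other ORF extraction settings.

```python
def max_orf_length_aa(read_length_nt):
    """Upper bound on the ORF length (in amino acids) a read can encode."""
    return read_length_nt // 3


def can_yield_orf(read_length_nt, min_length_aa=30):
    """True if a read is long enough to contain an ORF of min_length_aa codons."""
    return max_orf_length_aa(read_length_nt) >= min_length_aa
```

For example, 75 nt reads are bounded at 25 AA, so none of them could pass a 30 AA minimum unless the threshold is lowered.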