Martin Steinegger

234 comments by Martin Steinegger

I added a parameter, `--contamination-len`, to control the length.

We currently predict contamination only for short sequences of length < 20 kb. The 20 kb can be in scaffolds or in single sequences. I assume you have just one long sequence?

The `_all` report should contain all the local alignments with cross-kingdom hits (`--kingdom`). This could be used to filter for longer sequences. Can you find the C. elegans and E. coli...

Yes, I agree. I had this on my todo list for quite some time. :( But currently I am quite flooded with work.

Could you please provide your cmake output as well?

The database module should allow you to download the GTDB database. It will build `names.dmp` and `nodes.dmp` based on the GTDB taxonomy.
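For readers who want to inspect the generated taxonomy files, here is a minimal Python sketch. It assumes the `names.dmp` and `nodes.dmp` files follow the usual NCBI taxdump layout (fields separated by `\t|\t`, each row ending in `\t|`); that layout is an assumption here, not something stated in this thread. The function names (`parse_dmp_line`, `load_nodes`, `load_scientific_names`, `lineage`) are hypothetical helpers and are not part of conterminator or MMseqs2.

```python
def parse_dmp_line(line):
    """Split one taxdump row into fields (assumes NCBI-style '\\t|\\t' separators)."""
    line = line.rstrip("\n")
    if line.endswith("\t|"):
        line = line[:-2]  # drop the row terminator
    return line.split("\t|\t")


def load_nodes(handle):
    """Read nodes.dmp rows into {tax_id: parent_id} and {tax_id: rank}."""
    parents, ranks = {}, {}
    for line in handle:
        if not line.strip():
            continue
        fields = parse_dmp_line(line)
        parents[fields[0]] = fields[1]
        ranks[fields[0]] = fields[2]
    return parents, ranks


def load_scientific_names(handle):
    """Read names.dmp rows, keeping only the 'scientific name' entries."""
    names = {}
    for line in handle:
        if not line.strip():
            continue
        fields = parse_dmp_line(line)
        if len(fields) >= 4 and fields[3] == "scientific name":
            names[fields[0]] = fields[1]
    return names


def lineage(tax_id, parents, names):
    """Walk from tax_id up to the root and return the names, root first."""
    path, seen = [], set()
    while tax_id in parents and tax_id not in seen:
        seen.add(tax_id)
        path.append(names.get(tax_id, tax_id))
        if parents[tax_id] == tax_id:  # the root points to itself
            break
        tax_id = parents[tax_id]
    return list(reversed(path))
```

Open each file with `open("nodes.dmp")` and `open("names.dmp")`, pass the handles to the loaders, and call `lineage` on a taxon ID to check that the hierarchy looks as expected.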

This should be fixed now. I updated conterminator to the newest version of MMseqs2, which should resolve the issue.

Thank you, @pmenzel. I will upload the results from the NR to the FTP tomorrow.

Sorry for the delay. I have added the NR files to the FTP: `ftp://ftp.ccb.jhu.edu/pub/data/conterminator`. There are two files: (1) `nr.ids.gz`, which contains only the identifier, and (2) `nr.gz`, which shows...
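Once `nr.ids.gz` has been downloaded, the identifiers could be loaded into a set for quick membership checks. The sketch below assumes the file is plain gzip-compressed text with one identifier per line (taking the first whitespace-delimited token); the exact layout is not described in this thread, so treat that as an assumption. `read_ids` is a hypothetical helper name.

```python
import gzip


def read_ids(path):
    """Return the set of identifiers in a gzip-compressed text file.

    Assumes one identifier per line; only the first whitespace-delimited
    token of each non-empty line is kept.
    """
    ids = set()
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                ids.add(line.split()[0])
    return ids
```

For example, `"SOME_ID" in read_ids("nr.ids.gz")` would report whether a given (placeholder) identifier is listed.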

Thank you for catching this! The number reported in the paper is from the kraken report. I lost some entries while converting the conterminator result to a kraken output...