Daniel Lundin

31 comments by Daniel Lundin

Just to let you know, @jfy133: It's been a month, and I still don't see when I'll have time for this. If you don't either, no problem, I'll get to...

IMO, "sbdi-gtdb" is better than "gtdb", as we know there are rRNA sequences in the GTDB collection that are assigned to the wrong species. "sbdi-gtdb" is phylogenetically vetted to remove these.

> Thank you so much Daniel!
>
> Now that I think about it, it may be a matter of different GTDB versions? For full-genome methods, we use the latest...

> That's precious. If I set up the repository to track the releases, will it be enough to be notified when it becomes available?

New releases of databases are included...

Try without the `-with-conda` flag. Since you're running with `-profile docker`, Nextflow will use Docker for software, so I imagine the `-with-conda` flag might confuse it. Next time, for quicker responses,...

Try installing Nextflow natively instead of in a Conda environment, i.e. deactivate the Conda environment and follow the instructions here: https://www.nextflow.io/ (under "Getting started").

Sounds like a good suggestion, although I'm not very fond of assuming tables are sorted correctly. I always use dplyr's `inner_join()` etc. to join on a key.
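To illustrate why joining on a key is safer than relying on row order, here is a minimal sketch in plain Python rather than dplyr. The table contents, the column names (`asv`, `count`, `genus`) and the helper `inner_join` are all made up for the example, and the helper assumes keys are unique in the right-hand table.

```python
# Two small tables whose rows are NOT in the same order (hypothetical data).
counts = [
    {"asv": "ASV2", "count": 7},
    {"asv": "ASV1", "count": 12},
    {"asv": "ASV3", "count": 3},
]
taxonomy = [
    {"asv": "ASV1", "genus": "Rickettsia"},
    {"asv": "ASV2", "genus": "Escherichia"},
]


def inner_join(left, right, key):
    """Return rows present in both tables, matched on `key`, not on row position.

    Assumes `key` values are unique in `right`.
    """
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


# Pairing rows by position silently attaches ASV2's count to ASV1.
positional = [{**c, **t} for c, t in zip(counts, taxonomy)]

# Joining on the key keeps each count with the right taxonomy and
# drops ASV3, which has no match.
joined = inner_join(counts, taxonomy, "asv")

for row in joined:
    print(row)
```

In R, `dplyr::inner_join()` with an explicit `by` argument plays the role of the hypothetical helper here.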

> Any insight here [@erikrikarddaniel](https://github.com/erikrikarddaniel)?

Actually not, although I assigned it to myself and then forgot, sorry for that. I just checked the obvious -- if there's a param...

It's relatively easy to add a database, so maybe you could contribute this yourself? You need to provide one or two URLs for download and a formatting script that outputs...

You can view all formatting scripts in the `bin` directory of the pipeline. The files look like the below.

`assignTaxonomy.fna`:

```
>Bacteria;Proteobacteria;Alphaproteobacteria;Rickettsiales;Rickettsiaceae;Rickettsia;Rickettsia felis
TGAGAGTTTGATCCTGGCTCAGAACGAACGCTATCGGTATGCTTAACACATGCAAGTCGGACGGACTAATTGGGGCTTGCTCCAATTAGTTAGTGGCAGACGGGTGAGTAACACGTGGGAATCTGCCCATCAGTACGGAATAACTTTTAGAAATAAAAGCTAATACCGTATATTCTCTACAGAGGAAAGATTTATCGCTGATGGATGAGCCCGCGTCAGATTAGGTAGTTGGTGAGGTAACGGCTCACCAAGCCGACGATCTGTAGCTGGTCTGAGAGGATGATCAGCCACACTGGGACTGAGACACGGCCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGGACAATGGGCGAAAGCCTGATCCAGCAATACCGAGTGAGTGATGAAGGCCCTAGGGTTGTAAAGCTCTTTTAGCAAGGAAGATAATGACGTTACTTGCAGAAAAAGCCCCGGCTAACTCCGTGCCAGCAGCCGCGGTAAGACGGAGGGGGCTAGCGTTGTTCGGAATTACTGGGCGTAAAGAGTGCGTAGGCGGTTTAGTAAGTTGGAAGTGAAAGCCCGGGGCTTAACCTCGGAATTGCTTTCAAAACTACTAATCTAGAGTGTAGTAGGGGATGATGGAATTCCTAGTGTAGAGGTGAAATTCTTAGATATTAGGAGGAACACCGGTGGCGAAGGCGGTCATCTGGGCTACAACTGACGCTGATGCACGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGAGTGCTAGATATCGGAAGATTCTCTTTCGGTTTCGCAGCTAACGCATTAAGCACTCCGCCTGGGGAGTACGGTCGCAAGATTAAAACTCAAAGGAATTGACGGGGGCTCGCACAAGCGGTGGAGCATGCGGTTTAATTCGATGTTACGCGAAAAACCTTACCAACCCTTGACATGGTGGTCGCGGATCGCAGAGATGCTTTCCTTCAGCTCGGCTGGACCACACACAGGTGTTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTCATTCTTATTTGCCAGCGGGTAATGCCGGGAACTATAAGAAAACTGCCGGTGATAAGCCGGAGGAAGGTGGGGACGACGTCAAGTCATCATGGCCCTTACGGGTTGGGCTACACGCGTGCTACAATGGTGTTTACAGAGGGAAGCAAGACGGCGACGTGGAGCAAATCCCTAAAAGACATCTCAGTTCGGATTGTTCTCTGCAACTCGAGAGCATGAAGTTGGAATCGCTAGTAATCGCGGATCAGCATGCCGCGGTGAATACGTTCTCGGGCCTTGTACACACTGCCCGTCACGCCATGGGAGTTGGTTTTACCTGAAGGTGGTGAGCTAACGCAAGAGGCAGCCAACCACGGTAAAATTAGCGACTGGGGTGAAGTCGTAACAAGGTAGCCGTAGGGGAACCTGCGGCTGGATTACCTCCTTA
```

I.e. each sequence's name is...
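Judging only from the example record above, the header looks like a `>` followed by taxonomic ranks separated by semicolons, with the sequence on the next line. Here is a minimal Python sketch that writes and reads records under that assumption. The function names `format_record` and `parse_records` are hypothetical, the sequence is shortened for readability, and this is not one of the pipeline's actual formatting scripts.

```python
def format_record(ranks, sequence):
    """Build one FASTA record whose header is the semicolon-joined taxonomy.

    Assumption: the header holds taxonomic ranks in order, separated by ';',
    as in the example record shown above.
    """
    header = ">" + ";".join(ranks)
    return f"{header}\n{sequence}\n"


def parse_records(text):
    """Parse FASTA text into a list of (ranks, sequence) tuples."""
    records = []
    ranks, seq_parts = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if ranks is not None:
                records.append((ranks, "".join(seq_parts)))
            ranks = line[1:].split(";")
            seq_parts = []
        else:
            # Sequence lines may be wrapped, so collect and join them.
            seq_parts.append(line)
    if ranks is not None:
        records.append((ranks, "".join(seq_parts)))
    return records


example_ranks = [
    "Bacteria", "Proteobacteria", "Alphaproteobacteria", "Rickettsiales",
    "Rickettsiaceae", "Rickettsia", "Rickettsia felis",
]
example_seq = "TGAGAGTTTGATCCTGGCTCAG"  # first bases only, shortened for the example

record = format_record(example_ranks, example_seq)
print(record, end="")
```

Parsing the output back with `parse_records(record)` is a quick way to check that a formatting script produces headers in the expected layout.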