
animal_nr

Open dleopold opened this issue 2 years ago • 7 comments

I am wondering if there is a reason that there is no animal_nr seed database. If I want to assemble rRNA gene regions for animals, would I need to build my own? Or is GetOrganelle not a good tool for this?

dleopold avatar Feb 16 '22 23:02 dleopold

Thanks for reaching out! This is a good question. No one has requested it yet, which is also the quick answer to your question. I would not expect animal_nr to be very different. We can put this on the schedule if you want.

We originally designed GetOrganelle, as the name says, for organelle genome assembly. We added plant_nr simply because we are plant guys. Some of my colleagues studied fungi and requested fungus_nr, and they verified that the database I made for it, in an area I was unfamiliar with, was helpful. GetOrganelle can potentially be efficient for calling any high-copy region in WGS data, so I also added an anonym mode for more diverse purposes. It can serve as your temporary solution if, as always, I cannot make the update in a timely manner.

Another excuse for not having explored nuclear ribosomal RNA further is the issue of incomplete concerted evolution, which can be painful for many taxa. We did not even include it in our GetOrganelle paper. We could be motivated if more people are interested, though.

Kinggerm avatar Feb 17 '22 06:02 Kinggerm

It would be great to add animal_nr to GetOrganelle. I will probably give the anonym approach a try sometime soon, but would be happy to contribute to the overall project. If I develop a database that works, is there a mechanism to compile the results into a shareable database that could be used by others?

dleopold avatar Feb 18 '22 17:02 dleopold

It would be great if you contributed an animal_nr database to the community.

GetOrganelleDB is where GetOrganelle pulls the default databases from, and it is the repository you may want to fork to share your database. For this, I just added a section, https://github.com/Kinggerm/GetOrganelleDB#how-to-contribute, to the GetOrganelleDB repository.

Kinggerm avatar Feb 18 '22 21:02 Kinggerm

I can confirm that using -F anonym with custom -s and --genes reference files has worked pretty well in my particular case. It would be terrific if this were added as a standard feature.

gacsinger avatar Mar 04 '22 22:03 gacsinger
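
For readers who want to try the approach described in the comment above, here is a minimal sketch of how such a command line might be put together. It only builds and prints the argument list; it does not run GetOrganelle. The flags `-F anonym`, `-s`, and `--genes` come from the comment; `-1`, `-2`, and `-o` are the usual paired-read and output options of `get_organelle_from_reads.py`. All file names, and the helper `build_anonym_command` itself, are hypothetical placeholders, and every other option is assumed to stay at its default.

```python
import shlex


def build_anonym_command(fq1, fq2, seed_fasta, label_fasta, out_dir):
    """Return the argv list for an anonym-mode run (hypothetical helper).

    seed_fasta  -> passed to -s       (custom seed sequences)
    label_fasta -> passed to --genes  (custom label/gene database)
    """
    return [
        "get_organelle_from_reads.py",
        "-1", fq1,             # forward reads
        "-2", fq2,             # reverse reads
        "-F", "anonym",        # anonymous target type, as in the comment above
        "-s", seed_fasta,      # user-supplied seed file
        "--genes", label_fasta,  # user-supplied label database
        "-o", out_dir,         # output directory
    ]


# Placeholder file names; replace them with your own data.
cmd = build_anonym_command(
    "sample_R1.fq.gz",
    "sample_R2.fq.gz",
    "animal_nr_seed.fasta",
    "animal_nr_label.fasta",
    "animal_nr_out",
)

# Print a shell-quoted command line that can be inspected before running it.
print(shlex.join(cmd))
```

Printing the command rather than executing it makes it easy to check the seed and label paths first; the list could then be handed to `subprocess.run(cmd, check=True)` once GetOrganelle is installed.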

Hello, I was wondering about the same issue and, as an animal person, I would love to see updates with animal_nr in the future!

timz0605 avatar Dec 01 '22 17:12 timz0605

@gacsinger Hello, I am wondering which FASTA file you are using as the seed and which file you are using as the label database?

timz0605 avatar Dec 01 '22 17:12 timz0605

Hi people, I wanted to ask whether there has been any update since @gacsinger's last comment. Would you still use the anonym function to assemble ribosomal gene regions from animal WGS data, or has some sort of standard feature been implemented?

max-baer avatar Feb 22 '23 14:02 max-baer