CAMISIM
How to map relevant genomes to OTU through scientific names?
command:
python metagenome_from_profile.py -p small_test/test.biom -c small_test/config_test.ini --ncbi ncbi/ -f --debug -ref small_test/ref.tsv -tmp small_test/tmp -o small_test/out/
warning:
the "ref.tsv" :
The scientific names in the ref.tsv file were obtained by searching the taxonomy of each OTU on NCBI, but CAMISIM does not seem to be able to map the OTUs to the correct reference genomes through them.
The mapping with scientific names currently only works with the online version of NCBI; the additional reference file is not used for the scientific name mapping. If you already know which genomes you want to use, or which genomes map to which entries in your BIOM file, it is probably better to use the de novo simulation of CAMISIM and just set the abundances given in the profile. Also see the last few answers of this issue for more details on how to do this.
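To illustrate "just set the abundances given in the profile", here is a minimal sketch that turns raw counts for one sample into relative abundances and writes them as tab-separated rows. The genome IDs and counts are made up, and the two-column genome_id/abundance layout is an assumption about what the de novo run expects, so please check it against the CAMISIM documentation before using it.

```python
import csv
import io


def normalise_abundances(counts):
    """Scale raw counts so they sum to 1.0; genome IDs are kept as-is."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("profile counts must sum to a positive number")
    return {genome_id: count / total for genome_id, count in counts.items()}


def write_abundance_tsv(abundances, handle):
    """Write one 'genome_id<TAB>abundance' row per genome (assumed layout)."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for genome_id, abundance in abundances.items():
        writer.writerow([genome_id, f"{abundance:.6f}"])


# Hypothetical counts taken from one sample column of a BIOM profile.
example_counts = {"genome_A": 30, "genome_B": 10, "genome_C": 60}

buffer = io.StringIO()
write_abundance_tsv(normalise_abundances(example_counts), buffer)
print(buffer.getvalue(), end="")
```

Writing to a file instead of the in-memory buffer only requires passing an opened file handle to write_abundance_tsv.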
Thank you very much, I understand what you mean now. By the way, if I don't know which genomes map to my BIOM file, how can I resolve the following warning:
These warnings should not stop CAMISIM from running; they only show that the NCBI mapping failed and that CAMISIM is using the additional references to fill up your data set, which means the data set might be less similar to your BIOM profile than desired. One way to get a little more accuracy is to edit the file scripts/get_genomes.py on line 47 and set MAX_RANK to e.g. order or class (this seems to be the level where your genomes have "real" scientific names). That, however, means CAMISIM might only find genomes of the same order/class as your BIOM profile genomes, which is quite a difference.
Now, if you know the mapping for some genomes but not for others, you can run CAMISIM from profile with the community_only option. This will yield a mapping file for all genomes. You can then replace the entries for the genomes whose mapping you know, keep CAMISIM's mapping for the others, and use the result as input for a de novo run. I hope this helps.
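The "replace the mapping you know, keep the rest" step could look roughly like the sketch below. It assumes both mappings are simple two-column, tab-separated lists (OTU ID, then genome ID); the real mapping file written by CAMISIM may be laid out differently, and all IDs and function names here are hypothetical.

```python
def read_mapping(lines):
    """Parse 'otu_id<TAB>genome_id' rows, skipping blank and '#' lines."""
    mapping = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        otu_id, genome_id = line.split("\t")[:2]
        mapping[otu_id] = genome_id
    return mapping


def merge_mappings(camisim_mapping, known_mapping):
    """Start from CAMISIM's mapping and override the entries you know."""
    merged = dict(camisim_mapping)
    merged.update(known_mapping)
    return merged


# Hypothetical content of the mapping produced by the community_only run.
camisim_lines = ["#otu\tgenome\n", "OTU_1\tgenome_X\n", "OTU_2\tgenome_Y\n"]
# Hypothetical hand-curated mapping for the OTUs you are sure about.
known_lines = ["OTU_2\tgenome_Z\n"]

merged = merge_mappings(read_mapping(camisim_lines), read_mapping(known_lines))
for otu_id, genome_id in merged.items():
    print(f"{otu_id}\t{genome_id}")
```

Entries in the curated mapping always win, so OTUs you have not curated keep whatever CAMISIM picked.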
Thank you!
It seems there are some new errors when I try the de novo simulation command: python metagenomesimulation.py MP_simulation/config.ini --debug
The error is as follows:
This error most likely occurs when there is a previous/unfinished CAMISIM run in your out directory. Could you make sure that the out directory is empty and try again?
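A quick way to check this before starting a run is sketched below. It is not part of CAMISIM; the function name is hypothetical and the demo uses a temporary directory in place of your real out directory.

```python
import os
import tempfile


def leftover_entries(out_dir):
    """Return the sorted names found in out_dir, or [] if it does not exist."""
    if not os.path.isdir(out_dir):
        return []
    return sorted(os.listdir(out_dir))


# Demo on a temporary directory standing in for the real output directory.
with tempfile.TemporaryDirectory() as demo_out:
    empty_result = leftover_entries(demo_out)

    # Simulate a file left behind by a previous, unfinished run.
    with open(os.path.join(demo_out, "leftover.txt"), "w") as handle:
        handle.write("stale\n")
    stale_result = leftover_entries(demo_out)

print("empty directory:", empty_result)
print("after a stale run:", stale_result)
```

If the list is not empty for your own out directory, move or delete those entries before running the simulation again.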
Unfortunately, I have encountered this problem again.
Hi, I noticed the same problem in this issue. I tried simulating 1GB, 5GB, 20GB, and 50GB, but all reported the same error. I don't think this problem is caused by the depth (GB per sample).
Since this error occurs during anonymising, could you try running CAMISIM without anonymising and see if you still encounter errors?
How do I run CAMISIM without anonymising?
Strangely, when I chose to simulate 10 species from all, there was no error.