TransposonUltimate
Increasing the number of threads
Good afternoon! Could you please help: I have a large Avena genome, about 3 Gb in size. When I run reasonaTE, the program takes a huge amount of time. Could you give me a clue how to increase the number of threads for running this tool? Thank you in advance for your response.
Dear Ylia Solomennikowa, to help you better, I need to know which step of reasonaTE you are at. Are you still at Step 2) Annotate genome with annotation tools? My following answers are for step 2:
I see two ways for you to increase the speed, depending on your environment (you are probably working on a Linux cluster?):
- You could execute reasonaTE in single-tool mode and run the individual tools in parallel, each in its own process / ssh session.
This means, instead of writing
conda activate transposon_annotation_tools_env
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool all
You could run these in parallel with the "&" operator. However, you need to wait until all of them have finished before you go on to the next steps:
conda activate transposon_annotation_tools_env
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool helitronScanner &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool ltrHarvest &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool mitefind &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool mitetracker &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool must &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool repeatmodel &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool repMasker &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool sinefind &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool sinescan &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool tirvish &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool transposonPSI &
reasonaTE -mode annotate -projectFolder workspace -projectName testProject -tool NCBICDD1000 &
wait
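The same idea can be sketched as a small shell function that starts every tool in the background, writes one log file per tool, and only returns once all of them have finished. This is a minimal sketch, not part of reasonaTE: the function name run_all_tools, the RUNNER / PROJECT_FOLDER / PROJECT_NAME variables, and the annotate_<tool>.log file names are assumptions made for illustration. Note that starting all twelve tools at once on a 3 Gb genome may need a lot of memory, so you may prefer to shorten the TOOLS list per run.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with reasonaTE.
# RUNNER can be overridden (e.g. RUNNER=echo) to dry-run the loop.
RUNNER="${RUNNER:-reasonaTE}"
PROJECT_FOLDER="${PROJECT_FOLDER:-workspace}"
PROJECT_NAME="${PROJECT_NAME:-testProject}"

# Tool names as used in the commands above.
TOOLS="helitronScanner ltrHarvest mitefind mitetracker must repeatmodel repMasker sinefind sinescan tirvish transposonPSI NCBICDD1000"

run_all_tools() {
    failed=0
    pids=""
    # Start one background job per tool, each with its own log file.
    for tool in $TOOLS; do
        "$RUNNER" -mode annotate -projectFolder "$PROJECT_FOLDER" \
            -projectName "$PROJECT_NAME" -tool "$tool" \
            > "annotate_${tool}.log" 2>&1 &
        pids="$pids $!"
    done
    # Block until every job has exited; count the ones that failed.
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    # Print the number of failed tools (0 means all succeeded).
    echo "$failed"
}
```

After "conda activate transposon_annotation_tools_env", calling run_all_tools prints the number of tools that exited with an error. Only continue with the next reasonaTE step when it prints 0, and check the corresponding log file otherwise.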
- The tools RepeatMasker and RepeatModeler in particular take a lot of time. You could also install and run them separately, using their own way of parallelizing (but this depends on your environment). After they have finished, you take their output and copy it into the folder structure of reasonaTE as described on the reasonaTE page. According to https://blaxter-lab-documentation.readthedocs.io/en/latest/repeatmodeler.html you could run them in parallel with the "-pa" flag.
Please let me know which step you are at, and we can find ways to speed it up. I hope this already helps. Best regards and looking forward to your answer, Kevin
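A standalone RepeatMasker run along these lines could look like the sketch below. It is only an illustration under assumptions: the GENOME, OUT_DIR and CORES variables and the two function names are hypothetical, and the run uses whatever repeat library your RepeatMasker installation defaults to. As far as I know, the RepeatMasker help text says that each "-pa" job uses 4 cores with the RMBlast search engine, so the sketch divides the core budget by 4. Please check this against the help output of your installed version.

```shell
#!/bin/sh
# Hypothetical inputs: adjust to your own genome file and core budget.
GENOME="${GENOME:-genome.fasta}"
OUT_DIR="${OUT_DIR:-repeatmasker_out}"
CORES="${CORES:-16}"

# Convert a core budget into a -pa value, assuming 4 cores per
# parallel job (RMBlast engine) and at least one job.
pa_jobs() {
    n=$(( $1 / 4 ))
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}

# Run RepeatMasker outside of reasonaTE with parallel batch jobs.
# -pa sets the number of parallel jobs, -dir the output directory.
run_repeatmasker() {
    mkdir -p "$OUT_DIR"
    RepeatMasker -pa "$(pa_jobs "$CORES")" -dir "$OUT_DIR" "$GENOME"
}
```

Calling run_repeatmasker with CORES=16 starts RepeatMasker with "-pa 4". The results in OUT_DIR then still have to be copied into the reasonaTE project folder as described on the reasonaTE page.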
Dear Ylia Solomennikowa,
I have an issue installing reasonaTE. How did you install RepeatMasker and RepeatModeler? Docker uses all of the available memory.