MayomicsVC

Increasing number of cores for running GATK4 tools.

Open ambarishK opened this issue 6 years ago • 3 comments

Hi! I am very interested in, and currently working on, improving the scalability of GATK4 tools. There are explicit parameter settings for increasing the number of nodes on a multi-node cluster. At the moment, though, I am using a single-node Spark cluster and want to check GATK4 performance on a multi-core machine. Which parameter do I need to set to increase the allotted number of cores?

Waiting for your reply.

ambarishK, May 25 '19 14:05

Hi! Your work looks interesting. Our GATK4 pipeline (in the `dev-gatk` branch) uses the non-Spark invocations of GATK4 tools. All these tools run single-threaded, with the exception of the HaplotypeCaller. We provide access to the threads of the HaplotypeCaller via the `HaplotyperThreads` parameter. Is this what you are looking for?
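For illustration, here is a minimal sketch of how a thread count could end up on a non-Spark HaplotypeCaller command line. It assumes the thread setting is forwarded to GATK4's `--native-pair-hmm-threads` argument, which parallelizes only the PairHMM step; how the pipeline actually wires `HaplotyperThreads` is not shown in this thread. The function name and the file names are hypothetical.

```python
import os
import shlex


def haplotypecaller_command(reference, input_bam, output_vcf, threads=None):
    """Build a non-Spark GATK4 HaplotypeCaller command as an argument list.

    Hypothetical helper: it only assembles the command, it does not run GATK.
    """
    if threads is None:
        # Fall back to the number of cores reported by the OS.
        threads = os.cpu_count() or 1
    if threads < 1:
        raise ValueError("threads must be >= 1")
    return [
        "gatk", "HaplotypeCaller",
        "-R", reference,
        "-I", input_bam,
        "-O", output_vcf,
        # Assumption: the pipeline's thread setting maps to this GATK4 flag.
        "--native-pair-hmm-threads", str(threads),
    ]


# Placeholder file names, for illustration only.
cmd = haplotypecaller_command("ref.fasta", "sample.bam", "sample.vcf.gz", threads=8)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where `gatk` is on the `PATH`.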

azzaea, May 25 '19 19:05

Yes. That will help me with the non-Spark GATK tools, especially HaplotypeCaller. I have also found the necessary parameters for the Spark-based tools. Thank you so much.
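The Spark parameters are not named in this thread. As a rough sketch only, and assuming the Spark tools are run in local mode on a single machine, the core count can be set through the `--spark-master 'local[N]'` argument that GATK4 Spark tools accept. The helper name, the choice of tool, and the file names below are hypothetical.

```python
import shlex


def spark_local_command(tool, tool_args, cores):
    """Build a GATK4 Spark tool command that runs locally on `cores` cores.

    Hypothetical helper: it only assembles the command, it does not run GATK.
    """
    if cores < 1:
        raise ValueError("cores must be >= 1")
    # Assumption: local mode; a standalone Spark cluster would use a
    # spark://host:port master URL instead of local[N].
    return ["gatk", tool, *tool_args, "--spark-master", f"local[{cores}]"]


# Placeholder tool arguments, for illustration only.
spark_cmd = spark_local_command(
    "MarkDuplicatesSpark", ["-I", "sample.bam", "-O", "marked.bam"], cores=8
)
print(shlex.join(spark_cmd))
```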

Could I get your email ID for further correspondence?

ambarishK, May 26 '19 08:05

I think you might find this paper handy too; but have it your way.

One advantage of GitHub issues is that your post can be seen by more than one person, so you are likely to get a quicker response. If email is more convenient, though, here is mine: azzaea(at)gmail.com.

azzaea, May 27 '19 19:05