
Grid engine support for terabyte (T) MEMTOT output from qhost, and cpu specifications

Open: cooketho opened this issue 9 years ago • 1 comment

The function obtainSystemConstants() in the GridEngineBatchSystem class in batchSystems/gridengine.py raised the error "ValueError: invalid literal for float(): 1.5T" when I ran it on a system with 1.5T of available memory. I modified the MemoryString class to handle qhost output in the terabyte (T) range.
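For illustration, here is a minimal sketch of the kind of suffix handling described above. It is not the actual MemoryString class from jobTree; the function name `parse_memory_string` and the table `_SUFFIX_BYTES` are hypothetical, and the sketch assumes that the uppercase suffixes K, M, G and T in qhost output denote binary multiples.

```python
# Hypothetical helper, not the real MemoryString class from jobTree.
# Assumption: uppercase suffixes in qhost output are binary multiples.
_SUFFIX_BYTES = {
    "K": 1024,
    "M": 1024 ** 2,
    "G": 1024 ** 3,
    "T": 1024 ** 4,  # the case that triggered the ValueError above
}


def parse_memory_string(text):
    """Convert a qhost-style memory value such as '1.5T' into bytes."""
    text = text.strip()
    suffix = text[-1:]
    if suffix in _SUFFIX_BYTES:
        # Strip the suffix before calling float(), then scale.
        return float(text[:-1]) * _SUFFIX_BYTES[suffix]
    # No recognized suffix: treat the value as a plain number of bytes.
    # float() still raises ValueError for anything it cannot parse.
    return float(text)
```

With this sketch, `parse_memory_string("1.5T")` yields a float number of bytes instead of raising, while an unrecognized suffix still fails loudly rather than being silently misread.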

jobTree then ran, but the jobs it submitted to SGE sat in the queued "qw" state indefinitely. The cause was that it requested a single processor per node via "qsub -l num_proc=1", and none of the nodes on my system has exactly one processor (they all have more). I modified the prepareQsub(cpu, mem) function to use "qsub -pe shm 1" instead. This now works on my system, but the function may need to be generalized for systems that use a parallel environment other than shm.
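One possible way to generalize this is to make the parallel environment name a parameter. The sketch below is an assumption, not the actual prepareQsub(cpu, mem) code: the function name `cpu_request_args` and its `parallel_env` parameter are hypothetical, and only the CPU part of the request is shown.

```python
# Hypothetical sketch, not the real prepareQsub() from jobTree.
def cpu_request_args(cpu, parallel_env="shm"):
    """Return the qsub arguments that request `cpu` processors.

    parallel_env is the name of a parallel environment configured on
    the cluster ("shm" on the reporter's system). Passing None falls
    back to the original "-l num_proc=N" resource request.
    """
    cpu = int(cpu)
    if cpu < 1:
        raise ValueError("cpu must be at least 1")
    if parallel_env is None:
        # Original behaviour: match hosts whose num_proc equals N.
        return ["-l", "num_proc=%d" % cpu]
    # Request N slots in the named parallel environment.
    return ["-pe", parallel_env, str(cpu)]
```

For example, `["qsub"] + cpu_request_args(1)` gives `["qsub", "-pe", "shm", "1"]`, and a site that uses a different parallel environment would pass its own name instead of editing the function.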

cooketho, Sep 20 '15 20:09

Thank you for the pull request. jobTree is now Toil and is maintained in a different repository. We are working to integrate your changes into Toil.

hannes-ucsc, Oct 07 '15 01:10