Problem starting ktserver

Open mikolmogorov opened this issue 10 years ago • 1 comments

I am running progressiveCactus on a single server machine (quite an old one), and it gets stuck with the following cactus.log:

The job seems to have left a log file, indicating failure: /home/mkolmogo/tools/progressiveCactus/test_dir/jobTree/jobs/t0/job
Reporting file: /home/mkolmogo/tools/progressiveCactus/test_dir/jobTree/jobs/t0/log.txt
log.txt:        ---JOBTREE SLAVE OUTPUT LOG---
log.txt:        Traceback (most recent call last):
log.txt:          File "/home/mkolmogo/tools/progressiveCactus/submodules/jobTree/src/jobTreeSlave.py", line 271, in main
log.txt:            defaultMemory=defaultMemory, defaultCpu=defaultCpu, depth=depth)
log.txt:          File "/home/mkolmogo/tools/progressiveCactus/submodules/jobTree/scriptTree/stack.py", line 153, in execute
log.txt:            self.target.run()
log.txt:          File "/home/mkolmogo/tools/progressiveCactus/submodules/cactus/pipeline/ktserverJobTree.py", line 139, in run
log.txt:            killPingInterval=self.runTimestep)
log.txt:          File "/home/mkolmogo/tools/progressiveCactus/submodules/cactus/pipeline/ktserverControl.py", line 130, in runKtserver
log.txt:            raise e
log.txt:        RuntimeError: Unable to launch ktserver.  Server log is: /home/mkolmogo/tools/progressiveCactus/test_dir/progressiveAlignment/Anc7/Anc7/Anc7_DB/ktout.log
log.txt:        Exiting the slave because of a failed job on host debruijn.ucsd.edu
log.txt:        Due to failure we are reducing the remaining retry count of job /home/mkolmogo/tools/progressiveCactus/test_dir/jobTree/jobs/t0/job to 0
log.txt:        We have set the default memory of the failed job to 34359738368 bytes
Job: /home/mkolmogo/tools/progressiveCactus/test_dir/jobTree/jobs/t0/job is completely failed

The server log, "test_dir/progressiveAlignment/Anc7/Anc7/Anc7_DB/ktout.log", says:

2015-04-08T18:12:22.229619-08:00: [SYSTEM]: ================ [START]: pid=4635
2015-04-08T18:12:22.229743-08:00: [SYSTEM]: opening a database: path=:#opts=ls#bnum=30m#msiz=50g#ktopts=p
2015-04-08T18:12:22.229923-08:00: [SYSTEM]: starting the server: expr=137.110.243.14:2084
2015-04-08T18:12:22.229973-08:00: [ERROR]: socket error: expr=137.110.243.14:2084 msg=bind failed

But if I change the __getHostName function in "submodules/cactus/pipeline/ktserverControl.py" to simply:

return "127.0.0.1"

everything works.
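
The "bind failed" error can be checked outside of ktserver by trying to bind a plain TCP socket to the address the hostname resolves to. This is only a minimal sketch for Python 3. It assumes the hostname is looked up with socket.gethostbyname(socket.gethostname()), which may not be exactly what __getHostName does, and the helper names can_bind and resolved_address are made up for illustration.

```python
import socket


def can_bind(address, port=0):
    """Return True if a TCP socket can be bound to address:port on this machine."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 asks the OS for any free port, so only the address is tested.
        sock.bind((address, port))
        return True
    except OSError:
        return False
    finally:
        sock.close()


def resolved_address():
    """Return the IP the system associates with the local hostname, or None."""
    try:
        # Assumption: this mirrors how the hostname is resolved in cactus.
        return socket.gethostbyname(socket.gethostname())
    except OSError:
        return None


if __name__ == "__main__":
    addr = resolved_address()
    print("hostname resolves to:", addr)
    print("can bind to it:", addr is not None and can_bind(addr))
    print("can bind to 127.0.0.1:", can_bind("127.0.0.1"))
```

If the second line prints False while the third prints True, the resolved address is not assigned to any local interface, which would match the ktout.log error above.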

It actually seems to be a system issue: for some reason the system returns the wrong IP address for the hostname (the machine has an externally accessible hostname, which might be the cause). Still, I want to mention it here in case somebody else runs into the same problem (and maybe cactus could handle this automatically).
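
One possible way to handle this automatically is to test the resolved address before using it and fall back to the loopback address when it cannot be bound. The sketch below (Python 3) is not cactus code: get_bindable_host and _can_bind are hypothetical names, and the gethostbyname lookup is an assumption about how the host is determined.

```python
import socket

LOOPBACK = "127.0.0.1"


def _can_bind(address):
    """Return True if a TCP socket can be bound to the given local address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((address, 0))  # port 0: let the OS choose a free port
        return True
    except OSError:
        return False
    finally:
        sock.close()


def get_bindable_host(candidates=None):
    """Return the first candidate that can be bound locally, else loopback.

    Hypothetical stand-in for __getHostName; not the actual cactus function.
    """
    if candidates is None:
        try:
            # Assumption: the default candidate is the resolved local hostname.
            candidates = [socket.gethostbyname(socket.gethostname())]
        except OSError:
            candidates = []
    for address in candidates:
        if _can_bind(address):
            return address
    return LOOPBACK


if __name__ == "__main__":
    print(get_bindable_host())
```

Falling back to 127.0.0.1 only makes sense for a single-machine run like this one. On a cluster, jobs on other nodes could not reach a server bound to the loopback address, so there it would be better to fail with a clear error.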

Apr 09 '15 19:04 mikolmogorov

Thanks.

Nov 29 '17 23:11 iminkin