
Error "could not find terminal" when training under horovodrun

Open joelN123 opened this issue 4 years ago • 0 comments

This occurs when running the train command under horovodrun:

docker run --runtime=nvidia -e CUDA_VISIBLE_DEVICES="0,1" -v `pwd`:/home/blueoil -v <dataset dir>:/home/blueoil/dataset:ro --user=$(id -u):$(id -g) <docker image> horovodrun -np 2 python blueoil/cmd/main.py train -c blueoil/configs/core/classification/lmnet_quantize_cifar100.py -e testing123 --recreate

This causes the following error:

[1,0]<stderr>:Traceback (most recent call last):
[1,0]<stderr>:  File "blueoil/cmd/main.py", line 23, in <module>
[1,0]<stderr>:    from blueoil.cmd.init import ask_questions, save_config
[1,0]<stderr>:  File "/home/blueoil/blueoil/cmd/init.py", line 21, in <module>
[1,0]<stderr>:    import inquirer
[1,0]<stderr>:  File "/usr/local/pyenv/versions/python3.6/lib/python3.6/site-packages/inquirer/__init__.py", line 6, in <module>
[1,0]<stderr>:    from .prompt import prompt
[1,0]<stderr>:  File "/usr/local/pyenv/versions/python3.6/lib/python3.6/site-packages/inquirer/prompt.py", line 3, in <module>
[1,0]<stderr>:    from .render.console import ConsoleRender
[1,0]<stderr>:  File "/usr/local/pyenv/versions/python3.6/lib/python3.6/site-packages/inquirer/render/__init__.py", line 2, in <module>
[1,0]<stderr>:    from .console import ConsoleRender
[1,0]<stderr>:  File "/usr/local/pyenv/versions/python3.6/lib/python3.6/site-packages/inquirer/render/console/__init__.py", line 9, in <module>
[1,0]<stderr>:    from inquirer import themes
[1,0]<stderr>:  File "/usr/local/pyenv/versions/python3.6/lib/python3.6/site-packages/inquirer/themes.py", line 9, in <module>
[1,0]<stderr>:    term = Terminal()
[1,0]<stderr>:  File "/usr/local/pyenv/versions/python3.6/lib/python3.6/site-packages/blessings/__init__.py", line 98, in __init__
[1,0]<stderr>:    self._init_descriptor)
[1,0]<stderr>:_curses.error: setupterm: could not find terminal

workaround

There are a couple of workarounds that seem fine in the short term, so this is probably not a high-priority issue.

  1. Set the TERM environment variable by adding -e TERM=linux to the docker run command.
  2. Open the Docker container in interactive mode and run horovodrun from inside it.
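
The first workaround could also be applied from inside Python rather than on the docker command line. The following is only a minimal sketch and not something Blueoil currently does: the helper name ensure_term is hypothetical, and the fallback value linux is simply the one used in workaround 1. It would have to run before inquirer is imported, because the traceback shows the error being raised at import time.

```python
import os


def ensure_term(environ=os.environ, default="linux"):
    """Set TERM to `default` if it is missing or empty; return the value in effect.

    Hypothetical helper for illustration only, not part of Blueoil.
    """
    if not environ.get("TERM"):
        environ["TERM"] = default
    return environ["TERM"]


# Must run before `import inquirer`: the traceback shows that
# inquirer/themes.py creates a Terminal() as soon as it is imported.
ensure_term()
```

An existing TERM value is left untouched, so interactive sessions that already define it keep their own setting.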

solution

I'm not sure what the best solution is. One option is to set the TERM environment variable in the Dockerfile so that it is always defined. However, something is strange: when running Blueoil train without horovodrun and printing the environment variables with os.environ, TERM is not included there either, so it does not seem to be strictly necessary. Somehow the inquirer module hits this error under the horovodrun process but not under a plain python process.
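
To narrow down why the two cases differ, it may help to print the same facts in both of them. The sketch below is a diagnostic aid only. It rests on an unconfirmed assumption: that terminal setup is attempted only when stdout looks like a TTY, and that horovodrun might give worker processes a pseudo-terminal while a plain docker run without -t does not. The function name terminal_report is hypothetical.

```python
import os
import sys


def terminal_report(environ=os.environ, stream=None):
    """Return the TERM value and whether the given stream is attached to a TTY.

    Hypothetical diagnostic helper, for illustration only.
    """
    stream = sys.__stdout__ if stream is None else stream
    try:
        is_tty = os.isatty(stream.fileno())
    except (AttributeError, OSError, ValueError):
        # No usable file descriptor (for example an in-memory stream).
        is_tty = False
    return {"TERM": environ.get("TERM"), "stdout_is_tty": is_tty}


if __name__ == "__main__":
    # Write to stderr so the report shows up next to the horovodrun log lines.
    print(terminal_report(), file=sys.stderr)
```

Running this once with plain python and once under horovodrun -np 2 in the same container would show whether the TTY status differs while TERM is unset in both.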

joelN123, Jul 07 '20 05:07