ai2thor
Unity exits with an error after the simulator runs for a few hours
Hi @ekolve @Lucaweihs @mattdeitke
Thanks for your great work on this simulator!
I use it for RL training and run it on a remote Ubuntu server without a physical monitor. It works fine with `startx()` as a virtual display buffer. However, when I run 4 processes at the same time in 4 different terminals, it runs fine for a few hours but eventually crashes at a seemingly random point with the following error:
Unity exit error, please check Player.log, the last action is xxxx
The error happens at the code line related to `fifo_server()` that receives messages.
When I check Player.log, the last line shows this error: `XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0" after 1235 requests (466 known processed) with 0 events remaining`.
I also noticed that while everything is still working (before the crash), `htop` shows far more than 4 subprocesses named `/.ai2thor/release/thor-linux64-xxxxxxx`.
Has anyone encountered a similar problem? Is there a solution? It is quite frustrating when a long-running program crashes before it finishes and you have to start over.
Thank you! Xubo
Could you try to start Xorg using the following script:
https://raw.githubusercontent.com/allenai/ai2thor/main/scripts/ai2thor-xorg
That contains a few fixes that may address what you are seeing. The script starts a screen for each device present on the system. So if you have 3 GPUs on the system, you will have three screens that can be addressed by setting the `DISPLAY` environment variable to `:0.0`, `:0.1`, and `:0.2`.
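As a rough sketch of how several training processes could be spread across those screens: the helper names `display_for_worker` and `worker_environment` and the round-robin assignment are only an illustration and are not part of ai2thor. The sketch assumes the X server number is 0, as in the log above.

```python
import os


def display_for_worker(worker_index: int, num_screens: int, server: int = 0) -> str:
    """Return an X display string such as ':0.1', assigning workers round-robin."""
    if num_screens < 1:
        raise ValueError("num_screens must be at least 1")
    if worker_index < 0:
        raise ValueError("worker_index must be non-negative")
    # ':<server>.<screen>' is the standard X11 display naming scheme.
    return f":{server}.{worker_index % num_screens}"


def worker_environment(worker_index: int, num_screens: int) -> dict:
    """Copy the current environment and point DISPLAY at this worker's screen."""
    env = dict(os.environ)
    env["DISPLAY"] = display_for_worker(worker_index, num_screens)
    return env


if __name__ == "__main__":
    # Hypothetical setup: 4 training processes on a machine with 3 GPUs.
    for worker in range(4):
        print(worker, display_for_worker(worker, num_screens=3))
```

The dictionary returned by `worker_environment` could be passed as `env=` to `subprocess.Popen` when launching each training process, or `DISPLAY` could be exported in each terminal before starting the run.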