
Unity exits with an error after the simulator has been running for a few hours

Open xubo92 opened this issue 3 years ago • 1 comments

Hi @ekolve @Lucaweihs @mattdeitke

Thanks for your great work on this simulator!

I use it for RL training and run it on a remote Ubuntu server without a physical monitor. It works fine with startx() as a virtual display buffer. However, when I run 4 processes at the same time in 4 different terminals, it runs fine for a few hours and then crashes at a seemingly random point with the following error:

Unity exit error, please check Player.log. The last action is xxxx, and the error occurs at the line of code related to fifo_server() that receives messages.

When I check Player.log, the last line is the error [XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0" after 1235 requests (466 known processed) with 0 events remaining].

I also noticed that while everything is still working (before the crash), htop shows far more than 4 subprocesses named /.ai2thor/release/thor-linux64-xxxxxxx.

Has anyone encountered a similar problem? Is there a solution? It is quite frustrating when a long-running program crashes before it finishes and you have to start over.

Thank you! Xubo

xubo92 avatar Feb 24 '22 22:02 xubo92

Could you try to start Xorg using the following script:

https://raw.githubusercontent.com/allenai/ai2thor/main/scripts/ai2thor-xorg

That script contains a few fixes that may address what you are seeing. It starts a screen for each device present on the system. So if you have three GPUs on the system, you will have three screens that can be addressed by setting the DISPLAY environment variable to :0.0, :0.1, and :0.2.
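
As a rough sketch of how each training process could pick one of those screens, the snippet below builds a DISPLAY value per worker. The helper names (display_for_worker, env_for_worker) and the round-robin assignment are assumptions made for illustration; they are not part of ai2thor or of the script above.

```python
import os


def display_for_worker(worker_index, num_screens, server=0):
    """Return an X DISPLAY string such as ':0.1' for a worker.

    Hypothetical helper: workers are spread over the screens
    round-robin, which is an assumption, not ai2thor behavior.
    """
    if num_screens < 1:
        raise ValueError("num_screens must be at least 1")
    screen = worker_index % num_screens
    return f":{server}.{screen}"


def env_for_worker(worker_index, num_screens):
    """Copy the current environment and point DISPLAY at one screen."""
    env = dict(os.environ)
    env["DISPLAY"] = display_for_worker(worker_index, num_screens)
    return env


if __name__ == "__main__":
    # Example: 4 training processes on a machine with 3 GPUs (3 screens).
    for i in range(4):
        print(f"worker {i} -> DISPLAY={display_for_worker(i, num_screens=3)}")
```

The environment returned by env_for_worker can be passed as the env argument when launching each training process (for example with subprocess.Popen), or DISPLAY can simply be exported in each terminal before starting the process.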

ekolve avatar Feb 24 '22 22:02 ekolve