DexterousHands
Segmentation fault (core dumped) in Docker
Device: NVIDIA A100 40GB PCIe GPU Accelerator
Method: Docker
Details:
I ran the following two commands in ~\bi-dexhands:
python train.py --task=ShadowHandOver --algo=ppo
python train.py --task=ShadowHandOver --algo=happo
In both runs, the model weights (xxx.pt) were saved correctly in ~\bi-dexhands\logs.
However, at the end of each run, the console shows the following error.
Output:
some episodes done, average rewards: tensor(16.7454, device='cuda:0')
some episodes done, average rewards: tensor(14.1145, device='cuda:0')
some episodes done, average rewards: tensor(15.4696, device='cuda:0')
some episodes done, average rewards: tensor(15.4252, device='cuda:0')
some episodes done, average rewards: tensor(14.8325, device='cuda:0')
some episodes done, average rewards: tensor(19.7192, device='cuda:0')
some episodes done, average rewards: tensor(15.9727, device='cuda:0')
Algo happo Exp check updates 48825/48828 episodes, total num timesteps 49997824/50000000, FPS 1922.
some episodes done, average rewards: tensor(14.0804, device='cuda:0')
some episodes done, average rewards: tensor(17.5084, device='cuda:0')
some episodes done, average rewards: tensor(18.6891, device='cuda:0')
Segmentation fault (core dumped)
Are there any suggestions for dealing with this error?
Thanks in advance!
Dear @RogerLZX,
I'm sorry, but we rarely run Isaac Gym in Docker, so I don't know the cause of this bug. It seems to appear only at the end of the task, so as a workaround you could try increasing the number of episodes.
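Since the crash seems to happen only at the very end, after the weights have been written, one option is a small wrapper that tells a segfault-after-save apart from a real failure. This is only a sketch and does not fix the crash itself. The helper names (`find_checkpoints`, `interpret_exit`, `run_training`) are made up for illustration. It assumes a POSIX system, where Python's `subprocess` reports death by signal N as return code -N, and that checkpoints are `.pt` files somewhere under the log directory, as in the report above.

```python
import signal
import subprocess
import sys
from pathlib import Path


def find_checkpoints(log_dir):
    """Return all .pt files found under log_dir, sorted by path."""
    return sorted(Path(log_dir).expanduser().rglob("*.pt"))


def interpret_exit(returncode, checkpoints):
    """Classify a finished run from its exit code and saved checkpoints."""
    if returncode == 0:
        return "ok"
    # On POSIX, subprocess reports death by signal N as returncode -N.
    if returncode == -signal.SIGSEGV and checkpoints:
        return "segfault-after-save"
    return "failed"


def run_training(cmd, log_dir):
    """Run cmd, then report how it ended and which checkpoints exist."""
    result = subprocess.run(cmd)
    checkpoints = find_checkpoints(log_dir)
    return interpret_exit(result.returncode, checkpoints), checkpoints


if __name__ == "__main__":
    # Assumption: launched from the directory that contains train.py and logs.
    status, ckpts = run_training(
        [sys.executable, "train.py", "--task=ShadowHandOver", "--algo=happo"],
        "logs",
    )
    print(status, len(ckpts), "checkpoint(s)")
```

Note that checkpoints left over from an earlier run would also be counted, so it is safer to point the wrapper at a fresh log directory.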
Isaac Gym is still in development, so bugs like this are inevitable. I recommend searching for or asking about this bug on the DevTalk forum; NVIDIA developers usually answer there if they know the cause.
Hope this can help you.
@cypypccpy Sorry, this issue is a duplicate of issue #8, which I opened by mistake. Please delete it.