
terminating with uncaught exception of type clmdep_msgpack::v1::type_error: std::bad_cast

Open nuomizai opened this issue 5 years ago • 18 comments

This error happened on the CARLA server when I used leaderboard and scenario runner to create my A3C training environment. Strangely, it only appeared a few hours after the start of training. Does anyone know how to solve it?
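For context, the Python client talks to the server over an RPC layer that serializes its messages with msgpack (hence the clmdep_msgpack in the error), and once the server process dies, pending calls on the client side typically fail with a RuntimeError. A minimal sketch of guarding a long training loop against that, where run_episode() is a hypothetical placeholder for the leaderboard/scenario_runner rollout:

```python
import time
import carla  # CARLA Python API

def connect(host="localhost", port=2000, retries=10):
    """(Re)connect to the CARLA server, retrying while it restarts."""
    for _ in range(retries):
        try:
            client = carla.Client(host, port)
            client.set_timeout(10.0)
            client.get_world()          # cheap RPC call to verify the link
            return client
        except RuntimeError:            # server down or still booting
            time.sleep(5.0)
    raise RuntimeError("CARLA server unreachable")

def run_episode(client):
    """Hypothetical placeholder for one rollout via leaderboard/scenario_runner."""
    client.get_world().get_weather()    # any RPC call can raise RuntimeError

client = connect()
for episode in range(1000):
    try:
        run_episode(client)
    except RuntimeError:                # server crashed mid-episode
        client = connect()              # reconnect once it is back up
```

This does not fix the crash itself, but it keeps the training job from dying along with the server.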

nuomizai avatar Nov 04 '20 00:11 nuomizai

Same issue occurring exactly as described! I am using 0.9.10. Have you found a solution to this yet?

yasser-h-khalil avatar Nov 08 '20 22:11 yasser-h-khalil

Sorry, @yasser-h-khalil. I haven't found the reason or a solution yet. I used leaderboard and scenario runner. What is your setup?

nuomizai avatar Nov 09 '20 12:11 nuomizai

This is the command I use to launch the server: DISPLAY= ./CarlaUE4.sh -opengl -carla-port=2000. I am using an RTX 5000 with the 410.48 driver. It works for hours and then crashes with the following error:

terminating with uncaught exception of type clmdep_msgpack::v1::type_error: std::bad_cast
terminating with uncaught exception of type clmdep_msgpack::v1::type_error: std::bad_cast
terminating with uncaught exception of type clmdep_msgpack::v1::type_error: std::bad_cast

Signal 6 caught.
Signal 6 caught.
Malloc Size=65538 LargeMemoryPoolOffset=65554 
Signal 6 caught.
CommonUnixCrashHandler: Signal=6
Malloc Size=65535 LargeMemoryPoolOffset=131119 
Malloc Size=123824 LargeMemoryPoolOffset=254960 
Engine crash handling finished; re-raising signal 6 for the default handler. Good bye.
Aborted (core dumped)
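Since the crash can take hours to reproduce, one pragmatic workaround is to supervise the server process and relaunch it whenever it aborts. A minimal Python sketch, assuming the launch command quoted above and that the script runs from the CARLA install directory (the path and the 10-second delay are assumptions, not official tooling):

```python
import os
import subprocess
import time

# Relaunch the server whenever it exits (e.g. after the Signal 6 abort above).
# Adjust the path to CarlaUE4.sh to match your install.
cmd = ["./CarlaUE4.sh", "-opengl", "-carla-port=2000"]
env = dict(os.environ, DISPLAY="")      # same off-screen trick as DISPLAY= ./CarlaUE4.sh

while True:
    server = subprocess.Popen(cmd, env=env)
    code = server.wait()                # blocks until the server dies
    print("CARLA exited with code", code, "- restarting in 10 s")
    time.sleep(10.0)                    # give the port time to be released
```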

yasser-h-khalil avatar Nov 09 '20 12:11 yasser-h-khalil

The error is exactly the same as the one I met! I used an old version of leaderboard and scenario runner to train my DRL agent in a distributed manner, with CARLA 0.9.9.3. Now I use the latest version of leaderboard and scenario runner and CARLA 0.9.10. I will tell you whether that works as soon as the training process finishes. Hope this helps if you have the same setup as me!

nuomizai avatar Nov 09 '20 13:11 nuomizai

Hello @nuomizai, are you using Traffic Manager?

yasser-h-khalil avatar Nov 12 '20 22:11 yasser-h-khalil

Hey @yasser-h-khalil, sorry for the delay. Yes, I'm using Traffic Manager. Actually, after I switched to the latest version of leaderboard and scenario runner, the error was gone. Have you figured out the reason for it?

nuomizai avatar Nov 16 '20 11:11 nuomizai

No, I am still facing this issue.

yasser-h-khalil avatar Nov 16 '20 19:11 yasser-h-khalil

@glopezdiest could you follow up on this please?

corkyw10 avatar Feb 26 '21 14:02 corkyw10

I met the same issue; have you solved it?

raozhongyu avatar Apr 24 '21 11:04 raozhongyu

Hey, this issue is probably related to this other one, which is a memory leak in the Leaderboard (LB). We do know that it exists, but we haven't found the cause yet.
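Until the leak is tracked down, a stop-gap some long-running setups try is to periodically reload the world between episodes so that server-side state does not keep accumulating. A minimal sketch, where RELOAD_EVERY is an assumed interval to be tuned per setup:

```python
import carla

client = carla.Client("localhost", 2000)
client.set_timeout(30.0)                # reload_world can take a while

RELOAD_EVERY = 50                       # assumption: tune to your training setup

for episode in range(10000):
    # ... run one leaderboard/scenario_runner episode here ...
    if episode > 0 and episode % RELOAD_EVERY == 0:
        # Destroys all actors and reloads the current map, freeing most
        # server-side state accumulated during previous episodes.
        client.reload_world()
```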

glopezdiest avatar Apr 26 '21 07:04 glopezdiest

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jul 21 '21 03:07 stale[bot]

Met the same issue, +1.

deepcs233 avatar Aug 21 '21 12:08 deepcs233

Observed this in the CARLA 0.9.12 container on Ubuntu 18.04 with a consumer Kepler GPU; it seems random.

qhaas avatar Oct 20 '21 12:10 qhaas

I met the same issue. Is there any solution?

grablerm avatar Jan 07 '22 12:01 grablerm

Me too!!!!!

Kin-Zhang avatar Feb 03 '22 14:02 Kin-Zhang

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Apr 16 '22 12:04 stale[bot]

Met the same issue, +1. Is there any solution?

jhih-ching-yeh avatar Jun 20 '22 18:06 jhih-ching-yeh

Met the same issue. CARLA 0.9.10, RTX 3090, Ubuntu 20.04.

hlchen1043 avatar Jul 07 '22 19:07 hlchen1043

Same here. CARLA 0.9.13, RTX 3080, Ubuntu 20.04.

buesma avatar Jan 24 '23 10:01 buesma

I also encountered this in a situation where I loop through scenarios in my code, which I believe is a serious bug in CARLA. CARLA 0.9.10, RTX 8000, Ubuntu 18.04, Python 3.7.

AtongWang avatar Apr 26 '23 06:04 AtongWang

Same issue here after 1000+ rounds of RL training, which I believe is a Traffic Manager error. Any suggestions?

Signal 11 caught.
Malloc Size=65538 LargeMemoryPoolOffset=65554
CommonUnixCrashHandler: Signal=11
Malloc Size=131160 LargeMemoryPoolOffset=196744
Malloc Size=131160 LargeMemoryPoolOffset=327928
Engine crash handling finished; re-raising signal 11 for the default handler. Good bye.
Segmentation fault (core dumped)

Any clue why CARLA crashed?

Device Info: GPU: NVIDIA Titan RTX 24G RAM: 64G CPU: i9 9900X

Ubuntu: 20.04.5 CUDA: 11.7 NVIDIA Driver Version: 525.89.02
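Not a confirmed fix, but Traffic Manager trouble on long RL runs is often associated with running asynchronously or with autopilot vehicles left over between episodes. A minimal cleanup sketch, assuming synchronous mode and the default TM port 8000:

```python
import carla

client = carla.Client("localhost", 2000)
client.set_timeout(10.0)
world = client.get_world()

# Keep the world and the Traffic Manager in lock-step (synchronous mode).
settings = world.get_settings()
settings.synchronous_mode = True
settings.fixed_delta_seconds = 0.05
world.apply_settings(settings)

tm = client.get_trafficmanager(8000)    # 8000 is the default TM port
tm.set_synchronous_mode(True)

# ... spawn vehicles with vehicle.set_autopilot(True, tm.get_port()) ...

# At the end of every episode, disable autopilot and destroy the actors so
# the TM does not keep stale references into the next episode.
for actor in world.get_actors().filter("vehicle.*"):
    actor.set_autopilot(False, tm.get_port())
    actor.destroy()
world.tick()                            # apply the destruction in sync mode
```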

Unkn0wnH4ck3r avatar May 02 '23 05:05 Unkn0wnH4ck3r

I got into the same situation when trying to train my own RL agent for over 150 epochs. I also used some memory profiling tools, like the memory-profiler and psutil Python modules, but the memory usage was not growing, so it shouldn't be a memory leak. Are there any better solutions? Tested on two machines:

  • machine 1: Ubuntu 20.04, GPU: NVIDIA RTX Quadro 6000, NVIDIA driver: 535.129.03
  • machine 2: Ubuntu 20.04, GPU: NVIDIA RTX 4070, NVIDIA driver: 535.104.05
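One caveat: memory-profiler and psutil used inside the training script only measure the Python process, while the suspected leak is in the CarlaUE4 server process. A minimal sketch of watching the server's resident memory instead, assuming the server shows up with 'CarlaUE4' in its process name:

```python
import time
import psutil

def carla_server_rss_mb():
    """Return the resident memory (MB) of the CarlaUE4 server, if running."""
    for proc in psutil.process_iter(["name", "memory_info"]):
        name = proc.info["name"] or ""
        if "CarlaUE4" in name and proc.info["memory_info"] is not None:
            return proc.info["memory_info"].rss / (1024 * 1024)
    return None

# Log the server's memory once a minute while training runs elsewhere.
while True:
    rss = carla_server_rss_mb()
    print("CarlaUE4 RSS:", "not running" if rss is None else "%.0f MB" % rss)
    time.sleep(60)
```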

CurryChen77 avatar Dec 05 '23 08:12 CurryChen77

Same issue here; the difference is that I'm just running the example file. When I run manual_control.py, the UE server crashes and the same error appears.

CMakey avatar Dec 28 '23 13:12 CMakey