carla-roach
Error with training RL expert
Hi, I'm running run/train_rl.sh and keep receiving this error:
[2022-05-15 08:09:58,133][utils.server_utils][INFO] - Kill Carla Servers!
CarlaUE4-Linux: no process found
[2022-05-15 08:09:59,167][utils.server_utils][INFO] - Kill Carla Servers!
[2022-05-15 08:09:59,168][utils.server_utils][INFO] - CUDA_VISIBLE_DEVICES=0 bash /home/thoaican/carla/CarlaUE4.sh -fps=10 -quality-level=Epic -carla-rpc-port=2000
4.24.3-0+++UE4+Release-4.24 518 0
Disabling core dumps.
Traceback (most recent call last):
File "train_rl.py", line 40, in main
agent = AgentClass('config_agent.yaml')
File "/home/thoaican/carla-roach/agents/rl_birdview/rl_birdview_agent.py", line 15, in __init__
self.setup(path_to_conf_file)
File "/home/thoaican/carla-roach/agents/rl_birdview/rl_birdview_agent.py", line 27, in setup
f = max(all_ckpts, key=lambda x: int(x.name.split('_')[1].split('.')[0]))
ValueError: max() arg is an empty sequence
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
[2022-05-15 08:10:05,478][wandb.sdk.internal.internal][INFO] - Internal process exited
I've browsed the issues page and found the same error reported by another person here. The suggested solution was to delete outputs/checkpoint.txt, but that did not help in my case.
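For reference, the ValueError in the traceback is what Python's built-in max() raises when it is given an empty sequence, which here means all_ckpts contained no checkpoint files. Below is a minimal sketch of a guarded version of that lookup. It is not the repository's code: the helper names step_of and latest_checkpoint are hypothetical, and the file name pattern "ckpt_<step>.pth" is only assumed from the parsing expression in the traceback.

```python
from pathlib import PurePath
from typing import Iterable, Optional


def step_of(ckpt: PurePath) -> int:
    # Same parsing as the expression in the traceback:
    # a name like "ckpt_1200.pth" (assumed pattern) yields 1200.
    return int(ckpt.name.split('_')[1].split('.')[0])


def latest_checkpoint(all_ckpts: Iterable[PurePath]) -> Optional[PurePath]:
    # default=None makes max() return None for an empty input
    # instead of raising "max() arg is an empty sequence".
    return max(all_ckpts, key=step_of, default=None)
```

With a guard like this, the caller can check for None and either start training from scratch or print a clearer message about the missing checkpoints.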
@thoithoi58, after you delete outputs/checkpoint.txt, run train_rl.sh again. Make sure you let training run long enough for something to sync with wandb. Then copy the run_id from outputs/checkpoint.txt into rl_birdview_agent.py.
Here's my code: https://github.com/neilsambhu/carla-roach/blob/NeilBranch0/timeline/README3.md (go to timestamps (1) 6/19/2022 5:44 PM and (2) 6/20/2022 11:50 AM).
Here is my rl_birdview_agent.py: https://github.com/neilsambhu/carla-roach/blob/NeilBranch0/agents/rl_birdview/rl_birdview_agent.py#L107
As of writing this message, I'm still waiting for my initial training to finish to see whether my code that reads the run_id from outputs/checkpoint.txt works.
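Roughly, reading the run_id back from the file could look like the sketch below. This is not the code from the linked branch: the function name read_run_id is hypothetical, and it assumes outputs/checkpoint.txt holds nothing but the run identifier as plain text.

```python
from pathlib import Path
from typing import Optional


def read_run_id(checkpoint_file: str = "outputs/checkpoint.txt") -> Optional[str]:
    # Assumption: the file contains only the wandb run identifier.
    path = Path(checkpoint_file)
    if not path.is_file():
        # No checkpoint file yet, e.g. before the first sync with wandb.
        return None
    text = path.read_text().strip()
    # Treat an empty file the same as a missing one.
    return text or None
```

Returning None for a missing or empty file lets the caller decide whether to start a fresh run instead of failing later while looking up checkpoints.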