Can't reproduce the experiment results reported in the original paper
Attached (impl.log) is the output I get when I run the code directly, without modification. In the cheetah-dir environment, only one training task learns anything, and it still does not reach the result reported in the original paper. The other training task fails completely, with its reward staying around -1200. Is there a problem with my configuration?
'''
.....
[2024-10-16 22:12:28,491][main][INFO] - Train step 173400
[2024-10-16 22:12:28,670][main][INFO] - Task 0 reward: 666.2502805933979
[2024-10-16 22:12:28,847][main][INFO] - Task 1 reward: -1249.0995450664027
[2024-10-16 22:12:34,060][main][INFO] - Train step 173500
[2024-10-16 22:12:34,228][main][INFO] - Task 0 reward: 640.641220138112
[2024-10-16 22:12:34,399][main][INFO] - Task 1 reward: -1219.1832915823895
[2024-10-16 22:12:39,579][main][INFO] - Train step 173600
[2024-10-16 22:12:39,744][main][INFO] - Task 0 reward: 651.884470225566
[2024-10-16 22:12:39,908][main][INFO] - Task 1 reward: -1260.7429719279978
[2024-10-16 22:12:45,129][main][INFO] - Train step 173700
[2024-10-16 22:12:45,293][main][INFO] - Task 0 reward: 668.2595137571427
[2024-10-16 22:12:45,457][main][INFO] - Task 1 reward: -1193.5729870485557
[2024-10-16 22:12:50,734][main][INFO] - Train step 173800
[2024-10-16 22:12:50,902][main][INFO] - Task 0 reward: 676.1063361438105
[2024-10-16 22:12:51,072][main][INFO] - Task 1 reward: -1322.9018547352944
'''
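In case it helps to compare runs, here is a minimal sketch for summarizing the per-task rewards from a log in the format above. It is not part of the repository: the function name `summarize_rewards` is hypothetical, and it assumes every evaluation entry looks like `Task <id> reward: <value>`, as in my excerpt.

```python
import re
from collections import defaultdict
from statistics import mean

# Assumed entry format, taken from the excerpt above:
# "... - Task 1 reward: -1249.0995450664027"
REWARD_RE = re.compile(r"Task (\d+) reward: (-?\d+(?:\.\d+)?)")


def summarize_rewards(log_text):
    """Return {task_id: (count, mean, min, max)} for each task found in log_text.

    Hypothetical helper; it scans the whole text, so it does not matter
    whether the log entries are on separate lines or run together.
    """
    rewards = defaultdict(list)
    for task_id, value in REWARD_RE.findall(log_text):
        rewards[int(task_id)].append(float(value))
    return {
        task: (len(values), mean(values), min(values), max(values))
        for task, values in sorted(rewards.items())
    }


# Two evaluation rounds copied from the excerpt above.
sample = """
[2024-10-16 22:12:28,670][main][INFO] - Task 0 reward: 666.2502805933979
[2024-10-16 22:12:28,847][main][INFO] - Task 1 reward: -1249.0995450664027
[2024-10-16 22:12:34,228][main][INFO] - Task 0 reward: 640.641220138112
[2024-10-16 22:12:34,399][main][INFO] - Task 1 reward: -1219.1832915823895
"""

for task, (count, avg, low, high) in summarize_rewards(sample).items():
    print(f"Task {task}: n={count} mean={avg:.1f} min={low:.1f} max={high:.1f}")
```

To run it on the full log, pass the file contents instead of `sample`, for example `summarize_rewards(pathlib.Path("impl.log").read_text())`.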