DDPG update with continuous action spaces

Open UtkarshMishra04 opened this issue 4 years ago • 13 comments

fix #140

Changes:

  1. Added test environments with continuous action spaces
  2. Added training support for continuous action environments
  3. Added DDPG support (currently TensorFlow 1.15; to be migrated to TensorFlow 2.4 (Keras) and repeated with PyTorch)

The pipeline is fully working; however, some segments are still hard-coded. These will be changed progressively.

Current Work:

  • [ ] Add the final results video link
  • [ ] Remove the hard-coded segments
  • [x] Add TensorFlow 2.4 and PyTorch support
  • [x] Add CUDA support
  • [x] Add TensorBoard support

Status: Files open for review. PR is not yet suitable for merge!

UtkarshMishra04 avatar Apr 08 '21 20:04 UtkarshMishra04

Hi Utkarsh!

Have you been able to continue working on this PR, so I can review it? You can add the points you mention as still pending as a task list:

  • [ ] task 1
  • [ ] task 2

so I can easily see what's completed and what's not.

sergiopaniego avatar Apr 15 '21 08:04 sergiopaniego

We target Keras/TensorFlow 2 as the base versions so that we don't have several versions in use for the same project. Do you think it's suitable to update the version here?

sergiopaniego avatar Apr 15 '21 08:04 sergiopaniego

Hi Sergio

I made the changes as a task list. I saw that all the DL-RL code is currently built in tf-v2.4 and my implementation is in tf-v1.15 (my bad!). I have already converted it and I am preparing some training results.

Still hard-coded though! I am actually wrapping up my current semester, which has slowed my pace. However, I will get back with the updates and let you know :)

Thanks

UtkarshMishra04 avatar Apr 15 '21 10:04 UtkarshMishra04

Hi @sergiopaniego

Issue #159 is correct; it is caused by the incompatibility of Keras with Python 3.7. Based on this resource:

As of now (May 2019), Keras is NOT compatible with Python 3.7. Therefore, you must downgrade your python version to 3.6. You can update when Keras support Python 3.7 on the following websites: Official Keras documentation.

What do you suggest about this? This is important, as we will not be able to use Keras. I would have to switch back to vanilla TensorFlow in that case. However, even then, a lot of tf2 modules rely on tf.keras.

Well, this might be a strong motivation to shift to PyTorch now!!

UtkarshMishra04 avatar Apr 19 '21 13:04 UtkarshMishra04

Hi! Good to know about that issue.

I guess your source is: https://stackoverflow.com/questions/55431953/trouble-installing-keras-with-python-3-7-3#:~:text=As%20of%20now%20(May%202019,Official%20Keras%20documentation

The direct reference pointed to in that post is this issue (https://github.com/keras-team/keras/issues/11690), which is closed. The conversation in that thread suggests that Python 3.7 is not tested (I don't know if it's currently tested, since the thread is from some time ago), but that doesn't mean that it won't work. It could have some issues, that's correct, but it could work.

What do you think about it?

sergiopaniego avatar Apr 20 '21 17:04 sergiopaniego

Hi!

Yes, definitely, you are right. I am copy-pasting code for the functions that are giving errors, but the interesting thing is that if you change the version pair for TensorFlow and Keras, you start getting different errors. Currently, I am working with tf2.4.1 and Keras 2.4. Please let me know if you have any updates on this issue. We have to make Keras work with Python 3.7 to progress efficiently with further development.
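Since the errors depend on the exact version pair, it can help to record which versions are active before reporting a failure. The following is only a sketch: the helper name `report_versions` is hypothetical, and it assumes each package exposes a `__version__` attribute, which is common but not guaranteed.

```python
import sys
from importlib import import_module


def report_versions(packages=("tensorflow", "keras", "torch")):
    """Map each package name to its version string, or None if the import fails."""
    versions = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    for name in packages:
        try:
            module = import_module(name)
        except Exception:
            # Not installed, or installed but broken on this interpreter.
            versions[name] = None
        else:
            # Assumption: the package exposes __version__.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions
```

Calling `print(report_versions())` and pasting the result next to a traceback makes it easier to compare error reports across different setups.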

Thanks a lot!

PS: This PR now works with any continuous-control RL algorithm. The environments are working fine!

UtkarshMishra04 avatar Apr 20 '21 18:04 UtkarshMishra04

Do you have a video showing the new functionality? 😄

Please update the PR since the repo has been renamed from Behavior Studio to Behavior Metrics. I can help you with it if you encounter any problems.

sergiopaniego avatar Apr 22 '21 07:04 sergiopaniego

Hi Sergio

I am still resolving the issues with Keras; however, the environments can be used with any RL algorithm. Also, I updated this PR with the Behavior Metrics changes.

I will keep you posted on the progress!

Thanks again!

UtkarshMishra04 avatar Apr 24 '21 00:04 UtkarshMishra04

While reviewing the code, I found that running a non-RL-based brain doesn't work because of the new Action config variable.

To reproduce the error, just run the default.yml file: python3 driver.py -c default.yml -g

In which cases do we need this config variable? Please review it! :D

sergiopaniego avatar Apr 28 '21 10:04 sergiopaniego

Hi

The action type deals with "discrete" and "continuous" action spaces for the RL environments. I think adding Action: None after Type: 'f1' will resolve this issue.
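An alternative to editing every config file is to let the loader tolerate a missing key, so non-RL brains keep working unchanged. The sketch below is only an illustration: the function name `resolve_action_type` and the flat dictionary layout are assumptions, not the project's actual loader. Only the `Action` and `Type` keys and the "discrete"/"continuous" values come from the discussion.

```python
from typing import Optional

# Values named in the discussion; anything else is rejected.
VALID_ACTION_TYPES = {"discrete", "continuous"}


def resolve_action_type(robot_config: dict) -> Optional[str]:
    """Return the action type, or None when no RL action space is configured."""
    # A missing key is treated the same as an explicit `Action: None`.
    action = robot_config.get("Action")
    # YAML parses a bare `None` as the string "None", not as a null value.
    if action is None or action == "None":
        return None
    action = str(action).lower()
    if action not in VALID_ACTION_TYPES:
        raise ValueError(f"Unsupported Action value: {action!r}")
    return action
```

With this approach, `resolve_action_type({"Type": "f1"})` returns `None` without raising, so a file like default.yml would not need the extra line.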

However, now I see another issue:

  20: GazeboEnv: launching gzserver.
  20: Executing app
  40: No module named 'ui.gui.resources.resources'
  20: closing all processes...
  20: Pilot: pilot killed.
  20: DONE! Bye, bye :)

Can you let me know if you get this as well? If not, it is most probably caused by some local changes on my side.

UtkarshMishra04 avatar Apr 28 '21 15:04 UtkarshMishra04

Hi again

I have resolved the issue with actions in config files. No change is required. Further, I have also added the laser environments with continuous actions. There was a minor issue with Qlearn Laser which is resolved now.

Finally, with the LASER environments I get another error:

  Traceback (most recent call last):
    File "driver.py", line 206, in
      main()
    File "driver.py", line 181, in main
      pilot = Pilot(app_configuration, controller, app_configuration.brain_path)
    File "/root/BehaviorStudio/behavior_metrics/pilot.py", line 67, in init
      self.initialize_robot()
    File "/root/BehaviorStudio/behavior_metrics/pilot.py", line 98, in initialize_robot
      self.brains = Brains(self.sensors, self.actuators, self.brain_path, self.controller)
    File "/root/BehaviorStudio/behavior_metrics/brains/brains_handler.py", line 23, in init
      self.load_brain(brain_path)
    File "/root/BehaviorStudio/behavior_metrics/brains/brains_handler.py", line 39, in load_brain
      exec(open(self.brain_path).read())
    File "", line 84, in
    File "/usr/local/lib/python3.7/dist-packages/gym/wrappers/monitor.py", line 38, in reset
      observation = self.env.reset(**kwargs)
    File "/root/BehaviorStudio/gym-gazebo/gym_gazebo/envs/f1/env_gazebo_f1_ddpg_laser.py", line 214, in reset
      laser_data = rospy.wait_for_message('/F1ROS/laser/scan', LaserScan, timeout=5)
    File "/opt/ros/noetic/lib/python3/dist-packages/rospy/client.py", line 419, in wait_for_message
      raise rospy.exceptions.ROSException("timeout exceeded while waiting for message on topic %s"%topic)
  rospy.exceptions.ROSException: timeout exceeded while waiting for message on topic /F1ROS/laser/scan

Is there any change in the laser scan topic?
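If the topic still exists (which `rostopic list` can confirm) and the publisher is only slow to start, retrying the wait may be enough. The helper below is a hypothetical sketch, not part of the PR: `wait_with_retries` and its parameters are invented for illustration, and it is written against a generic callable so it can be tried without ROS. It will not help if the topic was renamed or is never published.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def wait_with_retries(fetch: Callable[[], T], attempts: int = 3, delay: float = 1.0) -> T:
    """Call `fetch` until it succeeds; re-raise the last error after `attempts` failures."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as error:  # e.g. rospy.exceptions.ROSException on timeout
            print(f"attempt {attempt}/{attempts} failed: {error}")
            if attempt == attempts:
                raise
            time.sleep(delay)


# Hypothetical use inside reset(), wrapping the call shown in the traceback:
# laser_data = wait_with_retries(
#     lambda: rospy.wait_for_message('/F1ROS/laser/scan', LaserScan, timeout=5)
# )
```

The printed attempt count also shows whether the message never arrives or only arrives late, which narrows down where the problem is.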

UtkarshMishra04 avatar Apr 28 '21 22:04 UtkarshMishra04

Hi @sergiopaniego

I have added PyTorch DDPG support and it is fully working, except that I need more computational power to complete the training. There are further changes to be implemented after training.

Please review the code and let me know. I have verified the DDPG implementation with Half-Cheetah!
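For reviewers less familiar with DDPG, one piece worth checking in any implementation is the soft (Polyak) update of the target networks. The sketch below is framework-free and only illustrates the update rule; it is not the code in this PR, and the name `soft_update`, the flat parameter lists, and the default `tau` are assumptions.

```python
from typing import List


def soft_update(target: List[float], source: List[float], tau: float = 0.005) -> List[float]:
    """Polyak averaging: new_target = tau * source + (1 - tau) * target, per parameter."""
    if len(target) != len(source):
        raise ValueError("parameter lists must have the same length")
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must be in [0, 1]")
    return [tau * s + (1.0 - tau) * t for t, s in zip(target, source)]
```

A small `tau` makes the target parameters trail the learned ones slowly, which is the usual way DDPG stabilises its bootstrapped critic targets. In a PyTorch implementation the same rule would be applied to each parameter tensor of the actor and critic.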

UtkarshMishra04 avatar May 03 '21 19:05 UtkarshMishra04

PyTorch coming to the project!! Great improvement! Seeing the requirements.txt updated with torch is good news 😄

Could you give me details of what's already implemented and how to reproduce it?

Thanks!!!

sergiopaniego avatar May 06 '21 16:05 sergiopaniego