ppo-pytorch
I found it works well for discrete action spaces when I ran the project, but I would like to know how to use it in a continuous action space.
I want to train the agent with the project code after customizing my environment to use gym's continuous (Box) spaces. The environment's states and actions are defined as follows:
    # 5-dimensional continuous actions, each bounded in [-3, 3]
    self.min_action = np.array([[-3, -3, -3, -3, -3]]).reshape(1, 5)
    self.max_action = np.array([[3, 3, 3, 3, 3]]).reshape(1, 5)
    # 10-dimensional observations: first 5 in [0, 50], last 5 in [0, 300]
    self.low_state = np.array(
        [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=np.float32
    ).reshape(1, 10)
    self.high_state = np.array(
        [[50, 50, 50, 50, 50, 300, 300, 300, 300, 300]], dtype=np.float32
    ).reshape(1, 10)
    self.action_space = spaces.Box(
        low=self.min_action, high=self.max_action, shape=(1, 5), dtype=np.float32
    )
    self.observation_space = spaces.Box(
        low=self.low_state, high=self.high_state, shape=(1, 10), dtype=np.float32
    )
Is it possible to implement this idea based on PPO with ICM? Thanks!
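For what it's worth, a common way to adapt a discrete-action PPO actor to a Box space like the one above is to replace the Categorical output with a Gaussian policy head: the network outputs a mean per action dimension, a learned log-std parameterizes the spread, and actions are sampled from a Normal distribution and clipped to the env bounds. Below is a minimal sketch of such a head (not code from this project; `GaussianActor` and its sizes are illustrative, matching the 10-dim state and 5-dim action above):

```python
import torch
import torch.nn as nn
from torch.distributions import Normal

class GaussianActor(nn.Module):
    """Hypothetical continuous-action head for PPO (illustrative sketch)."""
    def __init__(self, state_dim=10, action_dim=5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.Tanh(),
            nn.Linear(64, action_dim),
        )
        # State-independent log-std, a common choice in PPO implementations
        self.log_std = nn.Parameter(torch.zeros(action_dim))

    def forward(self, state):
        mean = self.net(state)
        return Normal(mean, self.log_std.exp())

actor = GaussianActor()
state = torch.zeros(1, 10)            # one flattened observation
dist = actor(state)
raw_action = dist.sample()
# Clip to the env's [-3, 3] bounds before stepping the environment
action = torch.clamp(raw_action, -3.0, 3.0)
# Joint log-prob over the 5 independent dims, needed for the PPO ratio
log_prob = dist.log_prob(raw_action).sum(-1)
```

The same `log_prob` plugs into the usual PPO clipped-surrogate ratio, so the rest of the training loop (and an ICM bonus added to the reward) should not need to change; only the policy head and the action sampling do.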