
[Proposal] Environment should terminate when adroit hand pen drops the pen

Open jjshoots opened this issue 2 years ago • 8 comments

Proposal

In AdroitHandPen, when the agent drops the pen, there is no way to recover, but the environment still does not terminate. The proposal, as in #111, is to enable environment termination on pen drop.

jjshoots avatar Feb 16 '23 12:02 jjshoots

Hi, I was following this thread and wanted to check: does the pen-drop condition that was removed in #111 need to be restored?

            # penalty for dropping the pen
            if obj_pos[2] < 0.075:
                reward -= 5
                # removed code
                terminated = True
 

leonasting avatar Jan 07 '24 23:01 leonasting
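A minimal sketch of the check being discussed, written as a standalone function rather than as the environment's actual `step` method. The name `apply_drop_penalty` and its signature are hypothetical; only the 0.075 threshold and the penalty of 5 come from the snippet above.

```python
DROP_HEIGHT = 0.075  # z threshold taken from the snippet above
DROP_PENALTY = 5.0   # penalty taken from the snippet above


def apply_drop_penalty(pen_z, reward, terminate_on_drop=True):
    """Return (reward, terminated) after applying the pen-drop rule.

    Hypothetical helper for illustration; not part of Gymnasium-Robotics.
    """
    dropped = pen_z < DROP_HEIGHT
    if dropped:
        reward -= DROP_PENALTY
    # Termination is only signalled when the flag is enabled.
    terminated = dropped and terminate_on_drop
    return reward, terminated
```

Keeping `terminate_on_drop` as a flag makes it easy to compare the current behaviour (penalty only) with the proposed one (penalty plus termination).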

Hey @leonasting, basically yes, but it also needs to be tested. Would you be interested in testing it and writing a PR, with a short report that shows the terminal frames?

@jjshoots what testing had you done?

Thanks!

Kallinteris-Andreas avatar Jan 08 '24 07:01 Kallinteris-Andreas

This was a while ago and I don't quite remember, but if I recall correctly, the AdroitHand environments are no-termination environments: negative rewards are incurred in perpetuity (or until truncation). So adding a termination signal to HandPen specifically doesn't make sense. At least, that's as much of the discussion as I can remember.

jjshoots avatar Jan 08 '24 14:01 jjshoots
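To make the difference concrete, here is a rough sketch of how the undiscounted return of one episode changes between the two schemes. The function `episode_return` and the example heights are made up for illustration; the threshold and penalty defaults reuse the values quoted earlier in the thread, and the real reward has other terms that are ignored here.

```python
def episode_return(pen_heights, drop_height=0.075, drop_penalty=5.0,
                   terminate_on_drop=False):
    """Sum the drop penalties over a sequence of pen z-coordinates.

    Hypothetical helper: all other reward terms are treated as zero.
    """
    total = 0.0
    for z in pen_heights:
        if z < drop_height:
            total -= drop_penalty
            if terminate_on_drop:
                break  # episode ends at the first dropped step
    return total


# Illustrative trajectory: the pen is dropped at the third step.
heights = [0.25, 0.22, 0.05, 0.05, 0.05]
print(episode_return(heights))                          # -15.0
print(episode_return(heights, terminate_on_drop=True))  # -5.0
```

Without termination, the penalty keeps accumulating until truncation; with termination, it is paid once.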

I'm interested in testing. Let me know what tests you want me to perform. Based on the code and the environment, I infer that any agent action after the pen has left the hand is redundant. In the meantime, I will capture a few screenshots of terminal frames with the earlier code.

leonasting avatar Jan 08 '24 18:01 leonasting

Initially, the pen has a z-coordinate of 0.25 in the hand, and the forehand has a value of 0.2. During experiments involving random movements, the z-coordinate of the pen stays between 0.2 and 0.25 while it is grasped. If dropped, it falls below 0.2 until it hits the table at around 0.8.

[Images: adroit_pen, adroit_pen_2]

leonasting avatar Jan 10 '24 01:01 leonasting

@leonasting it is not very clear from this camera angle; try camera_id=3 (an argument to the make constructor).

Kallinteris-Andreas avatar Jan 12 '24 14:01 Kallinteris-Andreas

I have attached a screenshot of the terminal state [image: adroit_pen_3] and another screenshot of the pen out of the hand [image: adroit_pen_4].

leonasting avatar Jan 12 '24 22:01 leonasting

@leonasting excellent, I think it is clear that

  1. below 0.8 the pen has fallen, and the hand cannot interact with it in any way

Now, can you show that

  2. there is no benefit in continuing to train after the pen has fallen

A simple ablation study should do it.

Kallinteris-Andreas avatar Jan 12 '24 23:01 Kallinteris-Andreas
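One possible shape for such an ablation, sketched under assumptions: `train_fn` is a hypothetical user-supplied callable that trains an agent with a given seed and termination setting and returns a final evaluation return. Nothing here is taken from Gymnasium-Robotics itself.

```python
from statistics import mean, stdev


def run_ablation(train_fn, seeds):
    """Compare final returns with and without termination on pen drop.

    train_fn(seed, terminate_on_drop) -> float is a hypothetical callable.
    Returns {terminate_on_drop: (mean_return, std_return)}.
    At least two seeds are needed, since stdev requires two data points.
    """
    results = {}
    for terminate_on_drop in (False, True):
        returns = [train_fn(seed, terminate_on_drop) for seed in seeds]
        results[terminate_on_drop] = (mean(returns), stdev(returns))
    return results
```

Using the same seeds for both settings keeps the comparison paired, so a difference in mean return is less likely to come from seed variance alone.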