learning_from_play
Change actions to be relative & in end-effector space (not joint space)
We hypothesise that relative actions will be easier to learn, since the model does not have to learn or account for the DC component of the signal. It is often observed in the literature that normalising and rescaling inputs greatly helps training. In early experiments, relative actions train noticeably faster (to equivalent loss) and appear to perform better on validation data.
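As a minimal sketch of the conversion (assuming absolute action targets and the proprioceptive states they were issued from are stored as aligned arrays; the names here are hypothetical), going relative is just a subtraction:

```python
import numpy as np

def to_relative_actions(absolute_actions: np.ndarray,
                        proprioceptive_states: np.ndarray) -> np.ndarray:
    """Convert absolute action targets into deltas from the current state.

    Subtracting the state removes the slowly varying "DC" offset, leaving a
    small, roughly zero-centred signal for the model to fit.
    """
    return absolute_actions - proprioceptive_states

def to_absolute_actions(relative_actions: np.ndarray,
                        proprioceptive_states: np.ndarray) -> np.ndarray:
    """Invert the conversion at execution time to recover absolute targets."""
    return relative_actions + proprioceptive_states
```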
Secondly, we want to combine this with learning actions in Cartesian end-effector space (rather than the current robot joint space), as Sholto reckons this gives smoother actions and is less prone to observation noise.
As a side note, relative quaternions seem to be computable by composing the next orientation with the inverse of the previous one, i.e. q_rel = q_next * q_prev⁻¹:
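A minimal sketch using scipy's `Rotation` (quaternions in scipy's scalar-last `(x, y, z, w)` order; function names are illustrative, not from the codebase). Note the frame convention: composing on the left expresses the delta in the world frame, whereas `q_prev⁻¹ * q_next` would express it in the end-effector frame.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def relative_quaternion(q_prev: np.ndarray, q_next: np.ndarray) -> np.ndarray:
    """Quaternion taking q_prev to q_next, i.e. q_rel = q_next * q_prev^-1."""
    r_rel = R.from_quat(q_next) * R.from_quat(q_prev).inv()
    return r_rel.as_quat()

def apply_relative_quaternion(q_prev: np.ndarray, q_rel: np.ndarray) -> np.ndarray:
    """Reconstruct the absolute orientation from a relative rotation action."""
    return (R.from_quat(q_rel) * R.from_quat(q_prev)).as_quat()
```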