Mixed double precision for PPO_RNN algorithm

Open lopatovsky opened this issue 1 year ago • 0 comments

Mixed precision

Motivation:

Inspired by RLGames, we implemented automatic mixed precision (AMP) to speed up PPO_RNN training, especially for large models.

Sources:

https://pytorch.org/docs/stable/amp.html

Speed eval:

Model: one LSTM layer (hidden size 768, sequence length 128) followed by an MLP with units [2048, 1024, 1024, 512].
Trained with Isaac Sim simulation. Since the timings include the simulation, the speedup on the skrl side is actually higher than this test shows.

Mixed Precision	Time (s)	Relative Time
No	155	1x
Yes	105	0.677x
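
To make the numbers easier to read, here is a small arithmetic sketch. The end-to-end speedup follows directly from the table. The learner-only figure depends on how much of each run is simulation, which was not measured here, so `assumed_sim_s` is a purely hypothetical value used to show the direction of the effect. It also assumes the simulation takes the same time in both runs.

```python
def speedup(baseline_s, optimized_s, fixed_overhead_s=0.0):
    """Speedup of the part of the run that the optimization can affect.

    fixed_overhead_s is time that is identical in both runs
    (here: simulation) and is subtracted from both measurements.
    """
    if not 0 <= fixed_overhead_s < optimized_s:
        raise ValueError("fixed overhead must be in [0, optimized_s)")
    return (baseline_s - fixed_overhead_s) / (optimized_s - fixed_overhead_s)


# Measured values from the table above.
end_to_end = speedup(155, 105)

# HYPOTHETICAL: suppose 50 s of each run were spent in the simulator.
assumed_sim_s = 50
learner_only = speedup(155, 105, assumed_sim_s)

print(f"end-to-end speedup:   {end_to_end:.2f}x")    # 1.48x
print(f"learner-only speedup: {learner_only:.2f}x")  # 1.91x under the assumption
```

The larger the share of time spent in simulation, the more the end-to-end figure understates the gain in the training code itself.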

Quality eval:

We trained a policy for our task multiple times with each configuration and did not observe any statistically significant difference in the quality of the final results.

Jul 15 '24 12:07 lopatovsky