DeepMimic
How to train with multi-clip rewards?
I would like to train an agent capable of turning left, turning right, and walking forward, as in the published video https://youtu.be/vppFvq2quQ0?t=332
Please let me know how to do it.
Also, is it possible to train an agent without mocap data? If yes, how?
Thanks @xbpeng
@xbpeng Expecting some help from you!
Sorry about the late response. We haven't included the code and data for the multi-clip reward. But it shouldn't be very difficult to implement given the existing framework.
- get clips of different walking and turning motions
- when computing the reward function in cSceneImitate::CalcReward, compute the reward with respect to every motion, and take the max reward
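In Python, the max-over-clips idea in the second step could be sketched as follows; the reward shape, pose representation, and function names here are simplifications for illustration, not the actual `cSceneImitate::CalcReward` implementation:

```python
import math

def pose_reward(sim_pose, ref_pose):
    # Simplified single-clip imitation reward: exponential of the
    # negative squared pose error (DeepMimic combines several such
    # weighted terms for pose, velocity, end-effectors, etc.).
    err = sum((s - r) ** 2 for s, r in zip(sim_pose, ref_pose))
    return math.exp(-2.0 * err)

def multi_clip_reward(sim_pose, ref_poses_per_clip):
    # Compute the imitation reward against every reference clip at the
    # current time step and keep the maximum, as suggested above.
    return max(pose_reward(sim_pose, ref) for ref in ref_poses_per_clip)

sim = [0.1, 0.2, 0.3]
clips = [[0.0, 0.0, 0.0], [0.1, 0.2, 0.3], [0.5, 0.5, 0.5]]
print(multi_clip_reward(sim, clips))  # -> 1.0 (the second clip matches exactly)
```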
This framework only supports training policies to imitate mocap, but you can change the reward function to ignore the mocap and instead compute the reward using some other objective.
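As one illustrative placeholder for "some other objective" (not something provided by the framework), a non-imitation reward could score forward progress along a target heading instead of pose matching:

```python
def heading_reward(root_velocity, target_direction):
    # Reward the component of the root velocity along a unit target
    # heading, clipped at zero so moving backwards is not rewarded.
    vx, vz = root_velocity
    tx, tz = target_direction
    return max(0.0, vx * tx + vz * tz)

print(heading_reward((1.5, 0.0), (1.0, 0.0)))   # -> 1.5
print(heading_reward((-1.0, 0.0), (1.0, 0.0)))  # -> 0.0
```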
@xbpeng Thank you for your comment. I will try it.
> Sorry about the late response. We haven't included the code and data for the multi-clip reward. But it shouldn't be very difficult to implement given the existing framework.
> 1. get clips of different walking and turning motions
> 2. when computing the reward function in cSceneImitate::CalcReward, compute the reward with respect to every motion, and take the max reward
Sorry, I understood the second point, but can you please explain how to get clips of different turning motions? @xbpeng
If there is any sample implementation, it would be of great help. Thanks!
You can find some free mocap clips online. For example http://mocap.cs.cmu.edu/. You'll need to convert the file formats to the format in data/motions/ before using it.
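The motion files in data/motions/ are JSON with a loop mode and a list of frames, each frame starting with its duration followed by the pose values. A minimal sketch of writing a converted clip into that layout is below; the frame contents are dummy placeholders, and the exact per-joint ordering depends on the character file:

```python
import json

def write_motion(frames, path, loop="wrap"):
    # Write frames in the DeepMimic data/motions/ JSON layout:
    # a "Loop" mode ("wrap" or "none") and a "Frames" list.
    with open(path, "w") as f:
        json.dump({"Loop": loop, "Frames": frames}, f, indent=2)

# Two placeholder frames: [duration, root position (3), root rotation
# quaternion (4), then joint rotations...]; the values are dummies.
frames = [
    [0.0333, 0.0, 0.85, 0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0333, 0.0, 0.85, 0.1, 1.0, 0.0, 0.0, 0.0],
]
write_motion(frames, "turn_left.txt")
```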
@xbpeng So how do you convert the file formats to the format used in DeepMimic? I noticed the coordinate conventions in the CMU dataset are somewhat different from those in DeepMimic.
@firefox1031 Have a look at this repository.
> Sorry about the late response. We haven't included the code and data for the multi-clip reward. But it shouldn't be very difficult to implement given the existing framework.
> 1. get clips of different walking and turning motions
> 2. when computing the reward function in cSceneImitate::CalcReward, compute the reward with respect to every motion, and take the max reward
@xbpeng When you calculate the reward with respect to every motion, is the simulated character always imitating a single motion clip during the whole training process?
If you have 4 motion clips (0, 1, 2, 3), does the simulated character always try to imitate clip 0 while you calculate the reward based on clips 0, 1, 2, and 3, or do you alternate the clips somehow during training, e.g., one iteration with clip 0 (and its reward), one iteration with clip 1 (and its reward), and so on?