Xiao Huang

4 comments by Xiao Huang

We trained SO2 on the halfcheetah-random-v2 dataset offline for 3M steps, achieving an average return of 3000. However, after online fine-tuning for 100K steps, the average return only reached around...

I used version 0.5.0 of DI-engine, but I still get an error in the `create_policy` function: `TypeError: __init__() got an unexpected keyword argument 'ensemble_num'`
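
A `TypeError` of this form generally means the class being constructed does not declare the keyword that the config passes in. One possible cause is an installed version that predates the parameter. Below is a minimal sketch of how one might check for that. The classes `OldQNetwork` and `NewQNetwork` and the helpers `accepts_keyword` and `build` are hypothetical stand-ins for illustration, not DI-engine's actual models or API.

```python
import inspect


class OldQNetwork:
    """Hypothetical stand-in for a model class that predates `ensemble_num`."""

    def __init__(self, obs_shape, action_shape):
        self.obs_shape = obs_shape
        self.action_shape = action_shape


class NewQNetwork:
    """Hypothetical stand-in for a model class that accepts `ensemble_num`."""

    def __init__(self, obs_shape, action_shape, ensemble_num=2):
        self.obs_shape = obs_shape
        self.action_shape = action_shape
        self.ensemble_num = ensemble_num


def accepts_keyword(cls, name):
    """Return True if cls.__init__ can be called with `name` as a keyword."""
    params = inspect.signature(cls.__init__).parameters
    param = params.get(name)
    if param is not None and param.kind is not inspect.Parameter.POSITIONAL_ONLY:
        return True
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def build(cls, **cfg):
    """Construct cls, naming every config key the constructor does not accept."""
    unsupported = [k for k in cfg if not accepts_keyword(cls, k)]
    if unsupported:
        raise TypeError(f"{cls.__name__} does not accept: {unsupported}")
    return cls(**cfg)
```

Running `accepts_keyword` against the model class that `create_policy` ends up instantiating would show whether the installed code knows about `ensemble_num` at all, which separates a version mismatch from a config mistake.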

I used version 0.5.1 of DI-engine and version 0.24.0 of gym, but I can still only reach the performance I described previously and cannot reproduce the results from the paper.
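
When comparing against a paper's setup, it can help to confirm which versions are actually installed in the active environment. The sketch below uses the standard library for that. It assumes the distribution names are `DI-engine` and `gym`, and the helper `installed_version` is a hypothetical name.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names here are assumptions; adjust them to match the environment.
for name in ("DI-engine", "gym"):
    print(name, installed_version(name))
```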