RuoqueLi
I don't think this code can solve the problem (Pendulum). Another question: why is the reward tracked as 'running_reward * 0.9 + score * 0.1'?
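For context, the expression quoted above has the form of an exponential moving average: each update keeps 90% of the previous value and blends in 10% of the newest episode score. The sketch below only illustrates that arithmetic. The function names `update_running_reward` and `smooth`, the `alpha` parameter, and the starting value are assumptions made for illustration and are not taken from the repository.

```python
def update_running_reward(running_reward, score, alpha=0.1):
    """One smoothing step: keep (1 - alpha) of the old value, add alpha of the new score.

    With alpha=0.1 this is the same arithmetic as running_reward * 0.9 + score * 0.1.
    """
    return running_reward * (1.0 - alpha) + score * alpha


def smooth(scores, initial=0.0, alpha=0.1):
    """Apply the smoothing step to a sequence of episode scores (hypothetical helper)."""
    running = initial  # assumed starting value; the original code may initialise differently
    history = []
    for s in scores:
        running = update_running_reward(running, s, alpha)
        history.append(running)
    return history


if __name__ == "__main__":
    # Made-up episode scores, used only to show how the smoothed value lags the raw one.
    for raw, avg in zip([10.0, 10.0, 10.0], smooth([10.0, 10.0, 10.0])):
        print(raw, round(avg, 3))
```

If that reading is right, the smoothed value changes slowly and is less noisy than single-episode scores, which would make it a logging or stopping signal and not the reward the agent is trained on. Whether the repository uses it that way is not confirmed here.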