sequicity

About Reinforcement Learning

Open gmftbyGMFTBY opened this issue 5 years ago • 0 comments

First of all, thank you for open-sourcing the code for this wonderful work. I have a question about the reinforcement learning part. In your implementation, the policy gradient step fine-tunes the parameters on the training dataset. In my understanding, though, an RL setup would normally use a user simulator as the environment that drives the parameter updates. Could you explain why you chose the training dataset instead? Thank you very much!
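To make the distinction concrete, here is a toy sketch of the two setups I have in mind. It is not taken from the repository: the tabular softmax policy, the `logged` pairs, and both reward functions are hypothetical stand-ins. The REINFORCE update is identical in both cases; the only difference is whether the reward comes from comparing a sampled action with logged data or from a simulated user reacting to it.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(theta, state, reward_fn, rng, lr=0.1):
    """One REINFORCE update for a tabular softmax policy theta[state][action]."""
    probs = softmax(theta[state])
    action = rng.choices(range(len(probs)), weights=probs)[0]
    reward = reward_fn(state, action)
    # Gradient of log pi(action | state) w.r.t. the logits is onehot(action) - probs.
    for a, p in enumerate(probs):
        grad = (1.0 if a == action else 0.0) - p
        theta[state][a] += lr * reward * grad
    return action, reward


# Setup 1 (dataset-driven): states come from logged dialogues and the reward
# checks the sampled action against the logged one. Hypothetical toy data.
logged = [(0, 1), (1, 0), (2, 1)]  # (state, logged_action)
gold = dict(logged)


def dataset_reward(state, action):
    return 1.0 if action == gold[state] else 0.0


# Setup 2 (simulator-driven): a hypothetical simulated user decides the reward,
# so the policy can be rewarded for actions that never appear in the logs.
def simulator_reward(state, action):
    user_is_satisfied = (action == state % 2)  # toy stand-in for a simulator
    return 1.0 if user_is_satisfied else 0.0


def train(reward_fn, states, steps=2000, seed=0):
    """Run REINFORCE for a fixed number of steps with a seeded RNG."""
    rng = random.Random(seed)
    theta = {s: [0.0, 0.0] for s in states}
    for _ in range(steps):
        state = rng.choice(states)
        reinforce_step(theta, state, reward_fn, rng)
    return theta


theta_dataset = train(dataset_reward, [0, 1, 2])
theta_simulator = train(simulator_reward, [0, 1, 2])
```

In the first setup the policy can only be pushed toward what the logs already contain, which is why I would have expected a simulator in the second role.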

gmftbyGMFTBY, Mar 20 '19 07:03