RLSeq2Seq
Deep Reinforcement Learning For Sequence to Sequence Models
Bumps [tensorflow-gpu](https://github.com/tensorflow/tensorflow) from 1.10 to 2.7.2. Release notes (sourced from tensorflow-gpu's releases): TensorFlow 2.7.2, Release 2.7.2. This release introduces several vulnerability fixes: fixes a code injection in saved_model_cli (CVE-2022-29216); fixes...
Hello, when I decode using the eval model something goes wrong; could you help me? The main information is: Traceback (most recent call last): File "run_summarization.py", line 845, in tf.app.run() File "/home/ices/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py",...
While running the Actor-Critic experiment, "Pre-Training Critic with Fixed Actor", the program stops unexpectedly after saying the replay buffer isn't loaded enough yet. The error code is actually this:...
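For context, a stop like this usually means the critic tried to sample before the buffer reached its minimum fill. Below is a minimal sketch of that guard; `ReplayBuffer`, `min_fill`, and `ready` are hypothetical names for illustration, not the repository's actual API.

```python
import random
from collections import deque

class ReplayBuffer:
    """Hypothetical fixed-size replay buffer; names are illustrative only."""

    def __init__(self, capacity, min_fill):
        self.buffer = deque(maxlen=capacity)
        self.min_fill = min_fill  # don't sample until this many transitions exist

    def add(self, transition):
        self.buffer.append(transition)

    def ready(self):
        # Guard behind the "replay buffer isn't loaded enough yet" message:
        # critic pre-training should only start once the fixed actor has
        # collected enough transitions.
        return len(self.buffer) >= self.min_fill

    def sample(self, batch_size):
        if not self.ready():
            raise RuntimeError("replay buffer not sufficiently filled yet")
        return random.sample(self.buffer, batch_size)
```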
I have tried your project for RL learning; however, its result is not so good. So, can you offer a pretrained RL model here?
WARNING: Logging before flag parsing goes to stderr. W0624 19:29:10.819931 140309927061376 deprecation_wrapper.py:118] From src/run_summarization.py:795: The name tf.app.run is deprecated. Please use tf.compat.v1.app.run instead. W0624 19:29:10.821898 140309927061376 deprecation_wrapper.py:118] From src/run_summarization.py:661: The...
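These warnings point at the fix themselves: route the deprecated calls through the `tf.compat.v1` namespace. A minimal sketch, assuming a TF 1.13+/2.x runtime; the entry point `main` here is illustrative, not the repository's actual code.

```python
import tensorflow as tf

def main(unused_argv):
    # Training/decoding logic would go here.
    print("entry point reached")

if __name__ == "__main__":
    # tf.app.run is deprecated; tf.compat.v1.app.run is the drop-in
    # replacement the warning suggests.
    tf.compat.v1.app.run(main)
```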
Hello! I'm running into a reshaping error when using RL and intermediate rewards. The output of `intermediate_rewards()` is a `# list of max_dec_step * (batch_size, k)` (line 241), and then this...
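For reference, a common way to get such a per-step list into a single reshapable tensor is `tf.stack` followed by a transpose. The sketch below takes the shapes from the quoted comment as assumptions; it is not the repository's actual code.

```python
import tensorflow as tf

# Assumed shapes, per the comment quoted in the issue.
max_dec_steps, batch_size, k = 10, 16, 4
rewards = [tf.random.uniform((batch_size, k)) for _ in range(max_dec_steps)]

# tf.stack turns the Python list into one (max_dec_steps, batch_size, k)
# tensor; transposing yields (batch_size, max_dec_steps, k), which is the
# layout a downstream reshape to (batch_size, -1) typically expects.
stacked = tf.stack(rewards)                     # (max_dec_steps, batch_size, k)
per_batch = tf.transpose(stacked, [1, 0, 2])    # (batch_size, max_dec_steps, k)
flat = tf.reshape(per_batch, [batch_size, -1])  # (batch_size, max_dec_steps * k)
```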
My batch_size is 64; I pretrain my model for about 50,000 iterations and get a better result than pgen's. Then I turn on the coverage mechanism and train the model...
Hi @yaserkl, thanks for publishing the code! On line 525 of `attention_decoder.py`, `embedding_lookup` throws an error when there are OOV (greater than vocab size) IDs in...
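The usual guard for this error is to map extended-vocabulary (copy-mechanism) IDs back to UNK before the lookup, since the embedding matrix only covers the base vocabulary. A minimal sketch; `UNK_ID` and the sizes below are assumptions, not the repository's constants.

```python
import tensorflow as tf

VOCAB_SIZE, EMB_DIM, UNK_ID = 50000, 128, 0  # assumed values for illustration

embedding = tf.Variable(tf.random.normal([VOCAB_SIZE, EMB_DIM]))
ids = tf.constant([[3, 7, 50002], [1, 50001, 4]])  # some ids exceed VOCAB_SIZE

# Replace any id >= VOCAB_SIZE (an OOV id produced by the copy mechanism)
# with UNK so embedding_lookup stays in range.
safe_ids = tf.where(ids >= VOCAB_SIZE, tf.fill(tf.shape(ids), UNK_ID), ids)
embedded = tf.nn.embedding_lookup(embedding, safe_ids)  # (2, 3, EMB_DIM)
```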
There is an error from an update to model.py: `return [self.calc_reward(t, _ss, _gts) for t in range(1, self._hps.max_dec_steps+1)]`. But `def calc_reward(self, _ss, _gts)` only takes two arguments; maybe it should be `return [self.calc_reward(_ss, _gts)...`
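A self-contained sketch of the mismatch and one possible resolution: keep the time-step argument in the signature, since the comprehension iterates over decoding steps. The reward computation itself is a placeholder, not the repository's actual metric.

```python
class Model:
    """Illustrative stand-in for the repository's model class."""

    def __init__(self, max_dec_steps):
        self.max_dec_steps = max_dec_steps

    def calc_reward(self, t, _ss, _gts):
        # Placeholder reward: fraction of sampled tokens `_ss` matching the
        # ground truth `_gts` up to decoding step `t`.
        return sum(1 for a, b in zip(_ss[:t], _gts[:t]) if a == b) / t

    def intermediate_rewards(self, _ss, _gts):
        # Matches the list comprehension quoted in the issue, now consistent
        # with the three-argument signature.
        return [self.calc_reward(t, _ss, _gts)
                for t in range(1, self.max_dec_steps + 1)]

m = Model(max_dec_steps=3)
print(m.intermediate_rewards([1, 2, 3], [1, 9, 3]))  # [1.0, 0.5, 0.666...]
```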
Hi authors, thanks for the excellent work! For this research direction, I think you may be interested in work on using RL in practical applications where user ratings or preferences are...