
Deep Reinforcement Learning For Sequence to Sequence Models

14 RLSeq2Seq issues

Bumps [tensorflow-gpu](https://github.com/tensorflow/tensorflow) from 1.10 to 2.7.2. Release notes sourced from tensorflow-gpu's releases. TensorFlow 2.7.2, Release 2.7.2: this release introduces several vulnerability fixes: Fixes a code injection in saved_model_cli (CVE-2022-29216) Fixes...

dependencies

Hello, when I decode using the eval model, something goes wrong; could you help me? The main information is: Traceback (most recent call last): File "run_summarization.py", line 845, in tf.app.run() File "/home/ices/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py",...

While running the Actor-Critic experiment ("Pre-Training Critic with fixed Actor"), the program stops unexpectedly after saying the replay buffer isn't loaded enough yet. The error is actually this:...
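
For reference, critic pre-training typically guards its updates until the replay buffer holds a minimum number of transitions. A minimal sketch of that pattern, with entirely hypothetical names (`ReplayBuffer`, `min_fill`) rather than the ones RLSeq2Seq uses:

```python
import random
from collections import deque

class ReplayBuffer:
    """Hypothetical fixed-size replay buffer (illustrative only)."""
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

def maybe_sample_for_critic(buffer, batch_size=64, min_fill=1000):
    # Skip the critic update until the buffer has enough transitions;
    # returning None here (instead of raising) avoids a hard stop.
    if len(buffer) < min_fill:
        return None
    return buffer.sample(batch_size)
```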

I have tried your project for RL learning; however, its results are not so good. Can you offer a pretrained RL model here?

WARNING: Logging before flag parsing goes to stderr. W0624 19:29:10.819931 140309927061376 deprecation_wrapper.py:118] From src/run_summarization.py:795: The name tf.app.run is deprecated. Please use tf.compat.v1.app.run instead. W0624 19:29:10.821898 140309927061376 deprecation_wrapper.py:118] From src/run_summarization.py:661: The...
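
These warnings come from running TF1-style code (`tf.app.run` and friends) under a newer TensorFlow. One common workaround, assuming TensorFlow 2.x is installed (the repo itself targets 1.10, so pinning that version is the other option), is the v1 compatibility shim:

```python
# Run TF1-style code under TensorFlow 2.x via the compat.v1 shim.
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

def main(unused_argv):
    # TF1-style entry point; flags and sessions behave as in 1.x.
    print(tf.__version__)

if __name__ == "__main__":
    tf.app.run(main)
```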

Hello! I'm running into a reshaping error when using RL and intermediate rewards. The output of `intermediate_rewards()` is a `# list of max_dec_step * (batch_size, k)` (line 241), and then this...
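
Without the full traceback this is only a guess, but a Python list of `max_dec_steps` tensors, each of shape `(batch_size, k)`, usually needs an explicit `tf.stack` before any reshape; reshaping the raw list is a common source of this error. A minimal sketch with hypothetical dimensions:

```python
import tensorflow as tf

max_dec_steps, batch_size, k = 5, 4, 3

# Stand-in for the output of intermediate_rewards():
# a list of max_dec_steps tensors, each of shape (batch_size, k).
rewards = [tf.ones([batch_size, k]) for _ in range(max_dec_steps)]

# Stack into one tensor first, then rearrange as needed.
stacked = tf.stack(rewards, axis=0)            # (max_dec_steps, batch_size, k)
per_sample = tf.transpose(stacked, [1, 0, 2])  # (batch_size, max_dec_steps, k)
```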

My batch_size is 64; I pretrain my model for about 50,000 iterations and get a better result than pgen's. Then I turn on the coverage mechanism and train the model...
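
For context, the coverage mechanism referenced here follows See et al. (2017): the coverage vector is the running sum of past attention distributions, and the per-step coverage loss is the sum over source positions of min(attention, coverage). A small illustrative sketch of that loss (not the repository's exact code):

```python
import tensorflow as tf

def coverage_loss(attn_dists):
    """attn_dists: list of per-decoder-step attention tensors,
    each of shape (batch, src_len). Returns per-example loss,
    as in See et al. (2017)."""
    coverage = tf.zeros_like(attn_dists[0])  # running sum of past attention
    losses = []
    for a in attn_dists:
        # Penalize re-attending to already-covered source positions.
        losses.append(tf.reduce_sum(tf.minimum(a, coverage), axis=1))
        coverage += a
    return tf.add_n(losses)
```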

Hi @yaserkl, thanks for publishing the code! On line 525 of `attention_decoder.py`, `embedding_lookup` throws an error when there are OOV ids (greater than the vocab size) in...
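
In pointer-generator-style models, ids from the extended vocabulary (>= vocab size, used for copied OOV words) must be mapped back to UNK before `embedding_lookup`, since the embedding matrix only covers the fixed vocabulary. A hedged sketch of that guard (the constants are hypothetical; use the vocabulary's actual size and [UNK] id):

```python
import tensorflow as tf

VOCAB_SIZE = 50000  # hypothetical
UNK_ID = 0          # hypothetical

def safe_embedding_lookup(embedding, ids):
    """Replace extended-vocab (OOV) ids with UNK before the lookup,
    so embedding_lookup never sees an out-of-range index."""
    clipped = tf.where(ids < VOCAB_SIZE, ids, tf.fill(tf.shape(ids), UNK_ID))
    return tf.nn.embedding_lookup(embedding, clipped)
```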

There is an error from an update in model.py: `return [self.calc_reward(t, _ss, _gts) for t in range(1, self._hps.max_dec_steps+1)]`. The definition `calc_reward(self, _ss, _gts)` only takes two arguments; maybe it should be `return [self.calc_reward(_ss, _gts)...`
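
The mismatch is between a call site that passes the timestep `t` and a method defined without it; either side can be changed. A minimal reproduction of the two possible fixes (a sketch, not the repository's exact code):

```python
class Model:
    def __init__(self, max_dec_steps):
        self.max_dec_steps = max_dec_steps

    # Option 1: accept t in the signature and keep the call site as-is.
    def calc_reward(self, t, _ss, _gts):
        return float(t)  # placeholder reward

    def all_rewards(self, _ss, _gts):
        return [self.calc_reward(t, _ss, _gts)
                for t in range(1, self.max_dec_steps + 1)]

# Option 2 (the fix suggested above): drop t from the call instead:
#   return [self.calc_reward(_ss, _gts) for t in range(1, self.max_dec_steps + 1)]
```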

Hi authors, thanks for the excellent work! For this research direction, I think you may be interested in work on using RL in practical applications where user ratings or preferences are...