Zhiting Hu

http://www.cs.cmu.edu/~zhitingh/

Carnegie Mellon University, Pittsburgh

Results: 30 comments by Zhiting Hu

Beam Search Is very Slow in Transformer

The transformer beam search is adapted from the official implementation ([tensor2tensor](https://github.com/tensorflow/tensor2tensor)). I'm not sure how it can be sped up. A possible way would be using a more efficient variant of transformer decoder...

Using reinforce_loss

To "produce a tensor with shape [bs, sl]" from `logits` and `sample_id`, you may use [`sequence_sparse_softmax_cross_entropy`](https://texar.readthedocs.io/en/latest/code/losses.html#texar.tf.losses.sequence_sparse_softmax_cross_entropy) and set

```
average_across_batch=False,
average_across_timesteps=False,
sum_over_batch=False,
sum_over_timesteps=False
```

-- Another way of doing RL...
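For intuition, here is a minimal framework-free sketch of what an unreduced per-token cross-entropy computes when every averaging and summing flag is off. It is not the Texar implementation: the function name `per_token_cross_entropy` and the nested-list inputs are assumptions made for illustration only.

```python
import math

def per_token_cross_entropy(logits, sample_id):
    """Return unreduced losses with shape [batch_size, seq_len].

    logits:    nested list of shape [batch_size, seq_len, vocab_size]
    sample_id: nested list of shape [batch_size, seq_len] holding token ids
    (Hypothetical helper, not part of Texar.)
    """
    losses = []
    for seq_logits, seq_ids in zip(logits, sample_id):
        row = []
        for step_logits, token in zip(seq_logits, seq_ids):
            # Numerically stable log-sum-exp over the vocabulary.
            m = max(step_logits)
            log_z = m + math.log(sum(math.exp(x - m) for x in step_logits))
            # Negative log-probability of the sampled token.
            row.append(log_z - step_logits[token])
        losses.append(row)
    return losses

# Toy input: batch_size=2, seq_len=2, vocab_size=3.
logits = [[[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
          [[0.0, 1.0, 0.0], [0.0, 0.0, 3.0]]]
sample_id = [[0, 1], [1, 2]]
nll = per_token_cross_entropy(logits, sample_id)  # one loss per (batch, step)
```

Because no reduction is applied, each entry of `nll` can later be weighted by a reward and masked by sequence length.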

Using reinforce_loss

The code looks good. Reference code (basically the same as what you wrote) is here: https://github.com/asyml/texar/issues/147#issuecomment-489442414 2- it's not really necessary, because you'd apply the mask with `reduce_with_weights`
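The idea of masking at reduction time can be sketched without any framework. The snippet below is an illustrative assumption, not the body of `reduce_with_weights`: the name `masked_reinforce_loss` and its arguments are hypothetical, and it assumes one scalar reward per sequence.

```python
def masked_reinforce_loss(nll, rewards, sequence_length):
    """Average reward-weighted per-token losses over non-padding positions.

    nll:             [batch_size, seq_len] negative log-probs of sampled tokens
    rewards:         [batch_size] one scalar reward (or advantage) per sequence
    sequence_length: [batch_size] number of valid tokens in each sequence
    (Hypothetical helper, not part of Texar.)
    """
    total, count = 0.0, 0
    for row, reward, length in zip(nll, rewards, sequence_length):
        for t, token_nll in enumerate(row):
            if t < length:  # mask: skip padding positions
                total += reward * token_nll
                count += 1
    return total / max(count, 1)

# Toy input: the 9.0 entries sit at padding positions and must be ignored.
nll = [[0.5, 1.0, 9.0], [2.0, 9.0, 9.0]]
rewards = [1.0, -0.5]
sequence_length = [2, 1]
loss = masked_reinforce_loss(nll, rewards, sequence_length)
```

Since padding positions are dropped inside the reduction, the per-token losses do not need a separate masking step beforehand.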

Using reinforce_loss

I can't see the cause from this. What's in `fetches` here?

```
File "roc_rl_main_refacored.py", line 724, in _train_epoch
    rets = sess.run(fetches, feed_dict, options=run_opts)
```

If optimization (e.g., `train_op`) is included:...

Using reinforce_loss

Running `train_op` (in `fetches`) will consume GPU memory for gradient tensors. A quick test is to remove `train_op` from `fetches` and see if the OOM is gone. If so, it means...
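One way to make this test easy to switch on and off is a small helper that builds `fetches` with or without the training op. This is only a sketch: `build_fetches` is a hypothetical name, and plain strings stand in for the real graph tensors and ops that would be passed to `sess.run`.

```python
def build_fetches(loss, train_op, include_train_op=True):
    """Assemble the fetches dict for sess.run, optionally without train_op.

    (Hypothetical helper for debugging; not part of TensorFlow or Texar.)
    """
    fetches = {"loss": loss}
    if include_train_op:
        # Fetching train_op triggers the backward pass and its gradient memory.
        fetches["train_op"] = train_op
    return fetches

# Placeholder strings stand in for the real loss tensor and training op.
debug_fetches = build_fetches("loss_tensor", "train_op", include_train_op=False)
train_fetches = build_fetches("loss_tensor", "train_op")
```

Running with `debug_fetches` exercises only the forward pass, which separates forward-pass memory use from gradient memory use.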

Using reinforce_loss

Removing `train_op` or using `tf.stop_gradient` is for debugging purposes only, to locate which portion of the code causes the OOM. Once it's located and fixed, you do need to add back...

Using reinforce_loss

Hmm... The OOM is caused by the optimization (backward pass). The gradients of `rl_loss_fine` and of `loss_mle` should each consume about the same amount of memory. To verify this -- since you've tried...

Using reinforce_loss

> @ZhitingHu I really appreciate your help.
> Yeah, that is a good test and actually I tried with just `loss==rl_loss_fine` and it threw the same error. Note that, I...

Using reinforce_loss

Glad to hear that! :) Could you briefly explain the cause of OOM, for future reference? Thanks

In Text-Style-Transfer Example, how do you use the model on a chosen sentence?

> Hi,
> I tried running the code for the text style transfer example after reading the related paper (Unsupervised Text Style Transfer using Language Models as Discriminators) and I have...
