Sam Shleifer
In your MT experiment, what do you use for weights? Here is my loss fn (modified from https://github.com/huggingface/transformers/blob/master/examples/seq2seq/finetune.py#L151):

```python
lm_logits = outputs[0]  # shape bs, seq_len, vocab_size
assert lm_logits.shape[-1]...
```
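For concreteness, here is roughly how I am plugging the dropper in. This is a sketch rather than my exact branch, and it assumes the `LossDropper(dropc=...)` interface from the loss-truncation README (called on per-sequence losses, returning a 0/1 keep mask):

```python
import torch.nn.functional as F
from loss_dropper import LossDropper  # assumed interface per the loss-truncation README

dropper = LossDropper(dropc=0.3)

def dropper_loss(lm_logits, labels, pad_token_id):
    # Per-token cross entropy, no reduction; padding positions contribute 0.
    tok_loss = F.cross_entropy(
        lm_logits.view(-1, lm_logits.size(-1)),
        labels.view(-1),
        ignore_index=pad_token_id,
        reduction="none",
    ).view(labels.size(0), -1)                      # (bs, seq_len)
    n_tokens = (labels != pad_token_id).sum(dim=1).clamp(min=1)
    seq_loss = tok_loss.sum(dim=1) / n_tokens       # mean loss per sequence
    mask = dropper(seq_loss)                        # 0 where a sequence should be dropped
    return (seq_loss * mask).mean()
```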
After 1 hr, the baseline has BLEU 21.57 (finetuning mBART on wmt-en-ro), while `dropper(dropc=0.3)` has a BLEU of 13.5, which seems to me like a bug.
I have run 4 12h experiments now, starting with your step 1. The key results are:
- `nn.CrossEntropyLoss` (my original loss fn) works better both with and without `dropper` than...
Did not understand it was a second stage, my bad! I tried it on an English-Romanian translator trained on a superset of the WMT English-Romanian dataset in another repo (MarianMT)...
Is there a Mac equivalent?
Awesome. That solves it! This command runs in 20 seconds:

```bash
python parlai/scripts/eval_model.py -t blended_skill_talk \
  -mf zoo:blender/blender_90M/model --metrics ppl --batchsize 32 --skip-generation true
```

Excited for that PR.
Is `--gpu -1` the correct option for multi-GPU? When I run `--gpu -1 -bs 24` on an 8-GPU setup, only 1 GPU gets used. When I run `--gpu -1...
I meant during training, sorry for being horribly unclear. To rephrase, I am trying to (a) train on multiple GPUs and (b) every val step (I know about `--validation_every_n_epochs`) run validation...
Would be super useful. The reason I couldn't figure this out was that

```bash
python examples/train_model.py --help | grep val
python examples/train_model.py --help | grep gpu
```

don't return the...
More specifically, trying `text/scripts/prepro.sh` with py3 results in `TypeError: can't concat str to bytes` from `load_vocab`. Are you guys considering updating to py3?
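For context, a minimal py3 repro of that error and the usual fix (decode the `bytes` line before concatenating); the names below are illustrative, not the actual `load_vocab` code:

```python
line = b"hello\n"   # files opened with "rb" yield bytes in py3

try:
    key = line.strip() + "_id"        # fine in py2, fails in py3
except TypeError as err:
    print(err)                        # can't concat str to bytes

key = line.strip().decode("utf-8") + "_id"   # decode bytes -> str first
print(key)                                   # hello_id
```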