Sam Shleifer
I saw your paper at ACL and want to test it out in my MT/Summarization training ([code](https://github.com/huggingface/transformers/blob/master/examples/seq2seq/finetune.py)). What should I pass as `weight` to `nn.NLLLoss` and what is the recommended...
Previously, each sequence was padded to the length of the longest sequence in the *dataset*. In this PR, each *batch* is padded to the length of the longest sequence in...
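To illustrate the difference between the two padding strategies described above, here is a minimal pure-Python sketch. It is not the code from the PR: the helper names (`pad_to_length`, `pad_per_dataset`, `pad_per_batch`), the pad id of `0`, and the toy dataset are all assumptions made for illustration.

```python
from typing import List

Sequence = List[int]


def pad_to_length(seq: Sequence, length: int, pad_id: int = 0) -> Sequence:
    """Right-pad a token id list with pad_id up to the given length."""
    return seq + [pad_id] * (length - len(seq))


def pad_per_dataset(dataset: List[Sequence], pad_id: int = 0) -> List[Sequence]:
    """Old behaviour: pad every sequence to the longest sequence in the dataset."""
    max_len = max(len(s) for s in dataset)
    return [pad_to_length(s, max_len, pad_id) for s in dataset]


def pad_per_batch(
    dataset: List[Sequence], batch_size: int, pad_id: int = 0
) -> List[List[Sequence]]:
    """New behaviour: pad each batch only to the longest sequence in that batch."""
    batches = []
    for start in range(0, len(dataset), batch_size):
        batch = dataset[start:start + batch_size]
        max_len = max(len(s) for s in batch)
        batches.append([pad_to_length(s, max_len, pad_id) for s in batch])
    return batches


# Hypothetical toy dataset of token id sequences with uneven lengths.
dataset = [[5, 6], [7], [8, 9, 10, 11, 12, 13], [14, 15, 16]]

# Total number of positions (real tokens plus padding) under each strategy.
static_tokens = sum(len(s) for s in pad_per_dataset(dataset))
dynamic_tokens = sum(len(s) for b in pad_per_batch(dataset, 2) for s in b)
```

With uneven sequence lengths, per-batch padding processes fewer padded positions than dataset-level padding, which is where any speed or memory saving would come from.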
Is there a way to temporarily disable the Slack sender during unit tests? I want to add knock knock to my training code but don't want it to Slack me every time...
When I run: ```bash python parlai/scripts/eval_model.py -t blended_skill_talk \ -mf zoo:blender/blender_90M/model --metrics ppl ``` I get output: ``` 14:57:01 INFO | 0.5% complete (30 / 5,651), 0:00:10 elapsed, 0:32:10 eta...
Similar to #474, I want to restrict my vocabulary, and then save a new model file that uses the restricted vocabulary. I tried to do this by saving a vocabulary,...
I have been struggling for a few hours to install on OSX and was wondering whether you guys have any tips. CMake seems to terminate successfully, but then `make -j4`...
I was inspecting intermediate values of the output tensor `transformer.h` while running `marian_decoder`, and noticed that on the first step through the decoder some sort of token is passed that has...
The shift-tab auto-completion display for **some** functions breaks their signature into entries of one argument per line that make them much harder to read. It is the same if you...
On the ber-es [transformer](https://object.pouta.csc.fi/OPUS-MT-models/ber-es/opus-2020-01-15.test.txt), if I run: ```bash spm_encode --model source.spm