Stephen Roller


If you have time (I don't immediately), can you trace through with model parallel and non-model parallel and see where things diverge?
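
If it helps narrow it down, one cheap comparison (purely a sketch; the task `convai2` and model file are placeholders, and this assumes a model that supports `--model-parallel`) is to run the identical eval with the flag toggled and diff the reported metrics:

```bash
# Placeholder task/model file: run the same evaluation with and without
# model parallel, then compare the reported metrics to see where they diverge.
python -m parlai.scripts.eval_model -t convai2 -mf /path/to/model --model-parallel false
python -m parlai.scripts.eval_model -t convai2 -mf /path/to/model --model-parallel true
```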

Yeah, a couple of things are messing with you:
- `eval_model` "forgets" the batch size, so everything is running with a batch size of 1. Based on `gpu_mem`, it looks...
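
A rough sketch of the workaround (the task and model file are placeholders): pass `--batchsize` explicitly so it isn't forgotten:

```bash
# Pass --batchsize explicitly so eval_model doesn't fall back to a batch size of 1
python -m parlai.scripts.eval_model -t convai2 -mf /path/to/model --batchsize 64
```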

(Also `--beam-delay` only does anything if `--inference delayedbeam` is set)
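
For reference, the two flags go together, e.g. (task and model file are placeholders):

```bash
# --beam-delay has no effect unless --inference delayedbeam is set
python -m parlai.scripts.eval_model -t convai2 -mf /path/to/model \
    --inference delayedbeam --beam-delay 30
```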

ALSO, I have this WIP PR that's very close but just needs some testing: https://github.com/facebookresearch/ParlAI/pull/2775. TL;DR: `eval_model` only uses one GPU, and the new PR fixes this.

Closing this, but lemme know if you have further questions, Sam. Cheers!

No, it will always use only 1 GPU for evaluation. If you use that PR, you can then run `multiprocessing_eval` with otherwise identical arguments and it will split the data...
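
Concretely, the switch looks like this (task, model file, and batch size are placeholders):

```bash
# Single-GPU evaluation
python -m parlai.scripts.eval_model -t convai2 -mf /path/to/model --batchsize 64

# With the PR: identical arguments, just a different entry point, and the data
# is split across the available GPUs
python -m parlai.scripts.multiprocessing_eval -t convai2 -mf /path/to/model --batchsize 64
```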

Ah, a few tricks to speed up training:
- There is `parlai.scripts.multiprocessing_train` (sketch below). It behaves just like I described for `multiprocessing_eval` above. Simply switch from calling `python -m parlai.scripts.train_model` to `python -m...
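
A sketch of that switch (task, model, and paths are placeholders):

```bash
# Before: single-process training
python -m parlai.scripts.train_model -t convai2 -m transformer/generator \
    -mf /tmp/model --batchsize 16

# After: multi-GPU training with the same arguments, different entry point
python -m parlai.scripts.multiprocessing_train -t convai2 -m transformer/generator \
    -mf /tmp/model --batchsize 16
```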

Oh, and `--eval-batchsize` is also an option, to pump up the batch size during validation, since you don't need activations/gradients there.
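
For example (placeholders again), keeping the training batch size modest while bumping it for validation:

```bash
# --eval-batchsize only affects validation, where no activations/gradients are kept
python -m parlai.scripts.train_model -t convai2 -m transformer/generator \
    -mf /tmp/model --batchsize 16 --eval-batchsize 64
```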