
Question about how to use top-p sampling

Open zwkkk opened this issue 2 years ago • 1 comments

When running the gigaword task, I get the error below:

UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  unfin_idx = bbsz_idx // beam_size
../aten/src/ATen/native/cuda/MultinomialKernel.cu:214: sampleMultinomialOnce: block: [4,0,0], thread: [0,0,0] Assertion sum > accZero failed.

My command:

python3 -m torch.distributed.launch --nproc_per_node=${GPUS_PER_NODE} --master_port=${MASTER_PORT} ../../evaluate.py \
    ${data} \
    --path=${path} \
    --user-dir=${user_dir} \
    --bpe=bert \
    --task=gigaword \
    --batch-size=16 \
    --log-format=simple --log-interval=10 \
    --seed=7 \
    --gen-subset=${split} \
    --results-path=${result_path} \
    --sampling \
    --sampling-topk 10 \
    --sampling-topp 0.7 \
    --beam=6 \
    --lenpen=0.7 \
    --max-len-b=32 \
    --no-repeat-ngram-size=3 \
    --fp16 \
    --num-workers=0 \
    --model-overrides="{\"data\":\"${data}\",\"bpe_dir\":\"${bpe_dir}\",\"selected_cols\":\"${selected_cols}\"}"

zwkkk, Feb 22 '23 08:02

Why are you considering top-p sampling? We have no relevant experience with it in this repo. It is probably still better to use beam search, following our practice, to get good results.

JustinLin610, Mar 15 '23 06:03
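
For readers unfamiliar with what --sampling-topp does, here is a minimal pure-Python sketch of top-p (nucleus) filtering. It is not the fairseq or OFA implementation; the names top_p_filter and sample_top_p are hypothetical and exist only for this illustration. The explicit check on the probability sum is an assumption based on reading the assertion text above (sum > accZero): it suggests the sampler received probabilities that did not add up to a positive value, for example because of NaN or all-zero entries. Whether that is what happened in this run is not confirmed by the thread.

```python
import math
import random


def top_p_filter(probs, top_p):
    """Keep the smallest set of highest-probability tokens whose mass reaches top_p.

    Returns a list of (token_index, renormalized_probability) pairs.
    """
    total = sum(probs)
    # Assumption: sampling needs a positive, finite probability mass,
    # analogous to the condition named in the CUDA assertion message.
    if not math.isfinite(total) or total <= 0.0:
        raise ValueError("probabilities must sum to a positive finite value")

    # Rank tokens from most to least probable.
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)

    kept, mass = [], 0.0
    for index, p in ranked:
        kept.append((index, p))
        mass += p
        if mass >= top_p * total:
            break  # the nucleus now covers at least top_p of the mass

    # Renormalize so the kept probabilities sum to 1.
    return [(index, p / mass) for index, p in kept]


def sample_top_p(probs, top_p, rng=random):
    """Draw one token index from the nucleus selected by top_p_filter."""
    nucleus = top_p_filter(probs, top_p)
    indices = [index for index, _ in nucleus]
    weights = [p for _, p in nucleus]
    return rng.choices(indices, weights=weights, k=1)[0]
```

With probabilities [0.5, 0.3, 0.15, 0.05] and top_p set to 0.7, only the first two tokens survive the filter and are renormalized before sampling. A distribution that sums to zero or contains NaN raises ValueError in this sketch instead of reaching the sampling step.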