
Top-K and Top-p sampling

Open · boblee22 opened this issue 1 year ago · 1 comment

Hi, thanks for your great work!

I have a question about the sampling process. When both top-K and top-p are enabled (e.g., https://github.com/allenai/RL4LMs/blob/main/scripts/training/task_configs/common_gen/t5_nlpo.yml#L44-L51), isn't top-p effectively ignored, since only the K most likely next words are kept and the probability mass is redistributed among those K words? Please correct me if my understanding is wrong. Thank you!

boblee22 · Oct 19 '22 04:10
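
To make the scenario in the question concrete, here is a minimal sketch of sequential top-K and top-p filtering on a toy distribution. It assumes the two filters are applied one after the other, with top-K first and top-p then computed over the renormalized top-K distribution. That ordering is an assumption for illustration, not a statement about how RL4LMs or its underlying generation code implements it. The function name `top_k_top_p_filter` and the toy probabilities are hypothetical.

```python
def top_k_top_p_filter(probs, k, p):
    """Return {token_index: renormalized_prob} after top-K, then top-p.

    Assumption: top-K runs first and top-p is evaluated on the
    renormalized top-K distribution (illustrative ordering only).
    """
    # Top-K: keep the k most likely tokens and renormalize their mass.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    top_k = ranked[:k]
    mass = sum(probs[i] for i in top_k)
    renorm = [(i, probs[i] / mass) for i in top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches p.
    kept = []
    cumulative = 0.0
    for i, q in renorm:
        kept.append((i, q))
        cumulative += q
        if cumulative >= p:
            break

    # Renormalize once more over the surviving tokens.
    final_mass = sum(q for _, q in kept)
    return {i: q / final_mass for i, q in kept}


# Hypothetical next-token distribution over a 5-token vocabulary.
probs = [0.5, 0.3, 0.1, 0.05, 0.05]

only_k = top_k_top_p_filter(probs, k=3, p=1.0)  # top-p disabled in effect
both = top_k_top_p_filter(probs, k=3, p=0.8)

print(sorted(only_k))  # [0, 1, 2]
print(sorted(both))    # [0, 1]
```

Under this assumed ordering, top-p is not necessarily a no-op: after top-K keeps three tokens, their renormalized probabilities are roughly 0.56, 0.33 and 0.11, so a threshold of 0.8 is already reached after two tokens and the third is dropped. Top-p only becomes redundant when the threshold is high enough that all K surviving tokens are needed to reach it.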