MerlinWang
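Could you please explain the exact pretraining procedure? I am pretraining on my own data, but I cannot get any of the related records, and I don't understand why create_pretrain_data.py contains this split(',').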
Hello, I just re-tested the BERTNLU-RuleDST-GDPL-TemplatedNLG policy.
I am afraid that the evaluation results of PPO may not be correct. There are two levels of evaluation. The first one is policy/evaluate.py, which is action-level; it follows the instructions...
This doesn't look right. tag.TXT already contains start and eos, so there is no need to apply the + 2 operation again.
I am also confused. What should I do if I just want the GAIL loss? Just use reward = - (1 - s).log()?
I found that my output is mostly "\ \", and I do not know why.
# packages in environment
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main
backports-weakref         1.0rc1                   pypi_0    pypi
bilm                      0.1.post5                pypi_0    pypi
bleach                    1.5.0                    pypi_0    pypi
ca-certificates           2019.10.16                    0
certifi                   2018.8.24...
Hi, did you already get the dataset? I would be very grateful if you could send me a copy of it.
Hi, thank you for your incredible work. Here is our new EMNLP 2023 paper on LLM evaluation for in-depth dialogue questions. Feel free to add it to your survey! Cue-CoT:...