Doxie

6 issues

**Describe the bug** I'm training a bloom model in step 3 using deepspeed-chat, with the offload option turned on. After 14 training steps, it raised the following error (see logs below). I...

bug
deepspeed-chat

bloom's default padding side is left, so why has the default padding side been changed to right throughout the Chinese bloom series? If I change it back to left for training, will that affect the model?
```
{
  "add_prefix_space": false,
  "bos_token": "",
  "clean_up_tokenization_spaces": false,
  "eos_token": "",
  "model_max_length": 2048,
  "pad_token": "",
  "padding_side": "right",
  "tokenizer_class": "BloomTokenizer",
  "unk_token": ""
}
```
[chinese_bloom_7b_chat_v3](https://huggingface.co/yuanzhoulvpi/chinese_bloom_7b_chat_v3/tree/main)
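To make the left-vs-right distinction concrete, here is a minimal sketch (not the repo's code) of how the two padding sides lay out a batch; the pad id `3` and the token ids are made up for illustration:

```python
# Hypothetical pad token id; real tokenizers expose this as pad_token_id.
PAD = 3

def pad_batch(seqs, side="right"):
    """Pad variable-length token-id lists to the length of the longest one."""
    width = max(len(s) for s in seqs)
    out = []
    for s in seqs:
        fill = [PAD] * (width - len(s))
        # Right padding appends pads; left padding prepends them.
        out.append(s + fill if side == "right" else fill + s)
    return out

batch = [[5, 6], [5, 6, 7, 8]]
print(pad_batch(batch, "right"))  # [[5, 6, 3, 3], [5, 6, 7, 8]]
print(pad_batch(batch, "left"))   # [[3, 3, 5, 6], [5, 6, 7, 8]]
```

Left padding is commonly preferred for decoder-only generation because the last position of every row is then a real token, which is the position the next-token prediction is read from; with right padding, training is unaffected as long as the attention mask and loss mask exclude the pad positions.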

chinese_bloom

I converted a llama model to NeMo, with model dirs laid out as below: ![image](https://github.com/NVIDIA/NeMo-Aligner/assets/6756880/2d36915a-a0ab-4c1a-8d20-0960a7948bdc) When I tried to load it to train a reward model, I got a missing-keys error. I...

bug
stale

In [reward_trainer.py](https://github.com/OpenLMLab/MOSS-RLHF/blob/main/rm/reward_trainer.py#L147), the probability distribution for the last token is dropped from lm_logits, but in the labels below it is the first token that is dropped. How do these two line up? ![image](https://github.com/OpenLMLab/MOSS-RLHF/assets/6756880/fe71ee16-ba46-4797-bce1-3160800976ed)
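This is the standard causal-LM shift: the logit at position t is the model's prediction for the token at position t+1, so the last logit (which predicts a token beyond the sequence) and the first label (which nothing predicts) are both dropped, and the remaining pairs line up one-to-one. A minimal sketch with made-up token names:

```python
tokens = ["<s>", "A", "B", "C"]

# A causal LM emits one prediction per input position; the prediction
# at index t is for tokens[t + 1].
predictions = [f"pred_after_{tok}" for tok in tokens]

# Drop the last prediction and the first label, then zip.
pairs = list(zip(predictions[:-1], tokens[1:]))
print(pairs)
# [('pred_after_<s>', 'A'), ('pred_after_A', 'B'), ('pred_after_B', 'C')]
```

So `lm_logits[:, :-1]` against `labels[:, 1:]` is the same alignment expressed on tensors.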

The code at [ppo_datahelper.py](https://github.com/OpenLMLab/MOSS-RLHF/blob/main/ppo/ppo_datahelper.py#L340) does not match its corresponding function. ![image](https://github.com/OpenLMLab/MOSS-RLHF/assets/6756880/8ad0f372-20c8-480c-829f-eb026ecff242) A few related questions:
1. Should padding here be left or right?
2. llama2 defaults to right padding, but I see that the batches in the reward model are all left-padded, and many places in the PPO code also pad to the left. What is the overall padding-alignment strategy?
3. I noticed that loss_mask ultimately sets the corresponding token ids to 0 ([ppo_trainer.py](https://github.com/OpenLMLab/MOSS-RLHF/blob/main/ppo/ppo_trainer.py#L464)) before the cross entropy with the model output. The masked positions still seem to backpropagate gradients as if their label were 0. Could you explain how this actually works? ![image](https://github.com/OpenLMLab/MOSS-RLHF/assets/6756880/c5b8d922-0bee-4dd5-95bb-589d7f2c3438)
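On question 3: overwriting a label id with 0 does not by itself stop gradients, but one common pattern (I have not verified this is what MOSS-RLHF does; the function and numbers below are illustrative) is to multiply the per-token loss by the mask before reducing, so masked positions contribute exactly zero loss and zero gradient, and the label value written there is irrelevant:

```python
import math

def masked_ce(log_probs, labels, loss_mask):
    """Mean cross entropy over unmasked positions only.

    log_probs: per-position list of per-class log probabilities.
    labels:    per-position target class id (value at masked positions is ignored).
    loss_mask: 1 keeps a position, 0 drops it from the loss entirely.
    """
    total, count = 0.0, 0
    for lp, y, m in zip(log_probs, labels, loss_mask):
        if m:  # masked positions never touch the loss, so no gradient flows
            total += -lp[y]
            count += 1
    return total / max(count, 1)

# Two positions, uniform log-probs over a 4-token vocab; the second
# position is masked, so only the first contributes.
lp = [[math.log(0.25)] * 4, [math.log(0.25)] * 4]
print(masked_ce(lp, [0, 0], [1, 0]))  # == -log(0.25) ≈ 1.386
```

Frameworks often achieve the same effect with an `ignore_index` in the loss function instead of a separate mask tensor; either way, positions excluded from the reduction receive no gradient.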

I'm running my program on a GPU cluster with Docker. The default Docker image has glibc 2.32 installed, and it's hard to upgrade it to 2.35. Is there any way...