https://github.com/hpcaitech/ColossalAI/blob/29386a54e66d7e5ca40cabf1686839fba9aac71d/applications/ChatGPT/chatgpt/models/base/critic.py#L46
For small models, the actor, critic, init-actor, and reward model can all be loaded on a single machine. But how should the PPO training process be built for an LLM?
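For context on the four roles the question names, here is a minimal, self-contained sketch of one PPO update in the RLHF setting, assuming plain PyTorch. The modules and shapes are toy stand-ins for illustration only, not the API of the linked repository:

```python
# Toy sketch of the four-model PPO layout: trainable actor and critic,
# frozen init-actor (reference policy) and frozen reward model.
import torch
import torch.nn.functional as F

vocab, hidden = 100, 32
actor = torch.nn.Linear(hidden, vocab)        # trainable policy head
critic = torch.nn.Linear(hidden, 1)           # trainable value head
init_actor = torch.nn.Linear(hidden, vocab)   # frozen reference policy
reward_model = torch.nn.Linear(hidden, 1)     # frozen reward head
for m in (init_actor, reward_model):
    m.requires_grad_(False)

states = torch.randn(4, hidden)               # toy rollout states
actions = torch.randint(0, vocab, (4,))       # toy sampled tokens

logprobs = F.log_softmax(actor(states), dim=-1).gather(1, actions[:, None]).squeeze(1)
with torch.no_grad():
    ref_logprobs = F.log_softmax(init_actor(states), dim=-1).gather(1, actions[:, None]).squeeze(1)
    # KL penalty keeps the actor close to the frozen initial policy.
    rewards = reward_model(states).squeeze(1) - 0.1 * (logprobs - ref_logprobs)

values = critic(states).squeeze(1)
advantages = (rewards - values).detach()
policy_loss = -(logprobs * advantages).mean() # simplified policy-gradient objective
value_loss = F.mse_loss(values, rewards)
(policy_loss + value_loss).backward()
```

At LLM scale the same structure holds, but the four models typically cannot share one device, which is the crux of the question.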
We want to continue fine-tuning a bloomz-7b1 model. Where can we get the model checkpoints, like the 176B one?
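For what it's worth, the base checkpoints are published on the Hugging Face Hub; a minimal loading sketch, assuming the `transformers` library (the repo ids below are the public Hub checkpoints, not files from this repository):

```python
# Load the public bloomz-7b1 checkpoint as a starting point for continued
# fine-tuning. The 176B variant is published as "bigscience/bloomz".
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-7b1")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-7b1")
```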
Hello, a quick question. In the pretraining stage, after processing the WuDao corpus, the data is first cut at length 2000 and then further cut into segments of 500. In get_input_data.py, the source_tokens and target_tokens for pretraining are constructed. Can these two span multiple documents?
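As a generic illustration of why this question arises (this is not the actual logic of get_input_data.py): when tokenized documents are concatenated into one stream before being cut into fixed-length windows, a window can cross a document boundary. The window size 4 below is a toy stand-in for the 2000/500 cut lengths:

```python
# Concatenate toy tokenized documents with an EOS separator, then slice
# the stream into fixed-length chunks; chunks may span document boundaries.
EOS = 0
docs = [[11, 12, 13], [21, 22], [31, 32, 33, 34]]

stream = []
for d in docs:
    stream.extend(d + [EOS])

window = 4
chunks = [stream[i:i + window] for i in range(0, len(stream), window)]
print(chunks)  # [[11, 12, 13, 0], [21, 22, 0, 31], [32, 33, 34, 0]]
```

Note the second chunk mixes tokens from two documents; whether get_input_data.py allows this depends on whether it packs a concatenated stream or pads each document separately.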