Results: 102 comments of hijkzzz

This method does not work with the ChatGPT sample + ZeRO2 > Hi @liuslnlp, your point is valid. Can you try [this](https://github.com/hpcaitech/ColossalAI/blob/a020eecc7051083e1dbc4a02bd49a9521b032aad/colossalai/utils/checkpoint/module_checkpoint.py#L9)?

Could you share this `zero_to_fp32` script? > I suggest creating two unified functions to save and load the above parameters. The storage format can follow the example of DeepSpeed:...
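For reference, DeepSpeed ships such a utility in `deepspeed/utils/zero_to_fp32.py`, which consolidates the partitioned ZeRO shards into one full-precision state dict. A minimal sketch, assuming a checkpoint directory written by `engine.save_checkpoint` (the paths here are placeholders):

```python
# Consolidate DeepSpeed ZeRO shards into a single fp32 state dict on CPU.
import torch
from deepspeed.utils.zero_to_fp32 import get_fp32_state_dict_from_zero_checkpoint

checkpoint_dir = "./ckpt"  # placeholder: directory given to engine.save_checkpoint

# Gathers the per-rank partitioned parameter shards into one full state dict.
state_dict = get_fp32_state_dict_from_zero_checkpoint(checkpoint_dir)

# Save in a plain single-file format that torch.load can read anywhere.
torch.save(state_dict, "pytorch_model.bin")
```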

Is there an ETA here? > We are designing the new checkpoint IO module to support checkpoint saving/loading from various formats, such as single model weights, huggingface...
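Purely as an illustration of the idea, a format-dispatching checkpoint IO layer could look like the sketch below; every name in it is hypothetical, since the actual module was still being designed at the time:

```python
# Illustrative sketch only: one interface, multiple on-disk checkpoint formats.
from abc import ABC, abstractmethod

import torch
import torch.nn as nn


class CheckpointIO(ABC):
    """Dispatches saving/loading across checkpoint formats."""

    @abstractmethod
    def save_model(self, model: nn.Module, path: str) -> None: ...

    @abstractmethod
    def load_model(self, model: nn.Module, path: str) -> None: ...


class TorchCheckpointIO(CheckpointIO):
    """Single-file model weights via plain torch.save/torch.load."""

    def save_model(self, model: nn.Module, path: str) -> None:
        torch.save(model.state_dict(), path)

    def load_model(self, model: nn.Module, path: str) -> None:
        model.load_state_dict(torch.load(path, map_location="cpu"))
```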

Because the reward model uses the EOS token to predict the reward value, we had to hack it.
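Concretely, here is a minimal sketch of that EOS dependency; the function name and shapes are my assumptions rather than the exact implementation. The scalar reward is read from the value head's output at the last non-padding position, which is where the EOS token sits:

```python
import torch
import torch.nn as nn


def eos_reward(hidden_states: torch.Tensor,
               attention_mask: torch.Tensor,
               value_head: nn.Linear) -> torch.Tensor:
    # hidden_states: (batch, seq_len, hidden); attention_mask: (batch, seq_len)
    values = value_head(hidden_states).squeeze(-1)        # (batch, seq_len)
    # Last non-padding index per sequence, i.e. where the EOS token sits.
    eos_indices = (attention_mask.sum(dim=1) - 1).long()  # (batch,)
    return values.gather(1, eos_indices.unsqueeze(1)).squeeze(1)
```

If a sequence is truncated before an EOS token is emitted, this index lands on an ordinary token, which is why the EOS assumption is fragile.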

I am not sure which approach to take at the moment, but our current implementation is heavily dependent on EOS tokens.

I have tried to fix it. Could you please try the latest code on the main branch?

> I tried the code, but it seems that `strategy.prepare` still moves parameters to the GPU. Maybe the EMA model should not go through `strategy.prepare`. But we must prepare the ema_model...

> Can we make ema_model only loaded on rank-0 and always on CPU, so there is no need for ZeRO-3? This is okay, but you need to modify the code.
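A rough sketch of that rank-0/CPU EMA idea, assuming `torch.distributed` is already initialized and each rank holds a full copy of the training weights (e.g. under ZeRO-2); the class name and decay value are illustrative, not the project's API:

```python
import torch
import torch.distributed as dist


class CpuEMA:
    """Keep the EMA weights on CPU, and only on rank 0."""

    def __init__(self, model, decay: float = 0.992):
        self.decay = decay
        self.shadow = None
        if dist.get_rank() == 0:
            # Full fp32 copy of the weights, pinned to CPU on rank 0 only.
            self.shadow = {k: v.detach().float().cpu().clone()
                           for k, v in model.state_dict().items()}

    @torch.no_grad()
    def update(self, model):
        # Non-zero ranks skip; no EMA copy ever lives on their GPUs.
        if dist.get_rank() != 0:
            return
        for k, v in model.state_dict().items():
            self.shadow[k].mul_(self.decay).add_(
                v.detach().float().cpu(), alpha=1 - self.decay)
```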

We will compare OpenRLHF with existing frameworks in a formal technical report. The biggest weaknesses of NeMo-based solutions are:
- Megatron-based generation is very slow, which leads to low...