Junbo Li
https://github.com/Yujun-Shi/FedCLS/blob/master/main.py#L142 does not initialize `argument_path` if `args.log_file_name` is specified.
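The kind of fix this report points toward can be sketched as follows. This is a minimal illustration and not the repository's actual code: the helper `resolve_log_paths`, the `args.logdir` attribute, the default stem, and the file suffixes are all assumptions made for the example. The idea is simply to derive `argument_path` on both branches, so it is defined whether or not `args.log_file_name` is given.

```python
import os
from types import SimpleNamespace


def resolve_log_paths(args):
    """Return (log_path, argument_path); hypothetical helper for illustration."""
    if args.log_file_name is None:
        # No name supplied: fall back to a default stem (assumed value).
        stem = "experiment_log"
    else:
        # Name supplied: reuse its stem so argument_path is still defined.
        stem = os.path.splitext(args.log_file_name)[0]
    # Both paths are built after the branch, so neither is left unset.
    log_path = os.path.join(args.logdir, stem + ".log")
    argument_path = os.path.join(args.logdir, stem + "_arguments.json")
    return log_path, argument_path


# Usage with a stand-in for the parsed command-line arguments.
args = SimpleNamespace(log_file_name="run1.log", logdir="logs")
log_path, argument_path = resolve_log_paths(args)
```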
We set up the environment and preprocessed the data following the provided instructions. However, while running `bash scripts/runs/run_pile_baseline120M.sh`, we noticed a sudden drop in speed after loading specific...
I have only one spare GPU on one node. How do I specify it? Setting CUDA_VISIBLE_DEVICES does not work.
Does this support PPO with a step-level PRM? Currently I only see scripts for PPO with a token-level RM. Specifically, how can we train PPO with [OpenRLHF/Mistral-7b-PRM-Math-Shepherd](https://huggingface.co/OpenRLHF/Mistral-7b-PRM-Math-Shepherd)? Is there training code and...
Does the SFT code here train on the whole chat or only on the responses (completions)?
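The two options in this question differ only in which tokens contribute to the loss. A minimal sketch of that distinction, assuming a single prompt/response pair that is already tokenized: `build_labels` and its `train_on_prompt` flag are hypothetical names for illustration and are not taken from the repository. The value -100 is the default `ignore_index` of PyTorch's `CrossEntropyLoss`, so positions labeled with it are excluded from the loss.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's CrossEntropyLoss


def build_labels(prompt_ids, response_ids, train_on_prompt=False):
    """Build (input_ids, labels) for one example; hypothetical helper."""
    input_ids = list(prompt_ids) + list(response_ids)
    if train_on_prompt:
        # Whole-chat training: every token is a prediction target.
        labels = list(input_ids)
    else:
        # Completion-only training: mask prompt tokens out of the loss.
        labels = [IGNORE_INDEX] * len(prompt_ids) + list(response_ids)
    return input_ids, labels


# Toy token ids: two prompt tokens followed by two response tokens.
input_ids, labels = build_labels([11, 12], [21, 22])
```

Checking whether a training script masks the prompt positions in this way is one way to answer the question for a given codebase.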