Zoisaang
> I found a link to resnet50-25c4b509.zip (https://github.com/LikeLy-Journey/SegmenTron/releases/download/v0.1.0/resnet50-25c4b509.pth) and tested it successfully! Thank you very much.
I use vicuna7b and EVA01-CLIP-g-14. The initial loss is 6.8479; after 1.0M samples, the loss is 2.6. Is that normal?
I met the same problem
So, is there a timeline for the SFT version?
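> When collecting data, we try to avoid duplicate entries. However, deduplication across data from different sources is not something this corpus needs to address. This corpus is benchmarked against the 40T of data used to train chatGPT, and that 40T of data, which includes web data, was not internally deduplicated either.

May I ask where the information that chatGPT used 40T of data comes from?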
I have the same issue, and I want to describe it in more detail. Here are the settings for GRPO in the example code: data.train_batch_size=1024 actor_rollout_ref.actor.ppo_mini_batch_size=256 actor_rollout_ref.actor.ppo_micro_batch_size_per_gpu=80 actor_rollout_ref.rollout.log_prob_micro_batch_size_per_gpu=160 actor_rollout_ref.ref.log_prob_micro_batch_size_per_gpu=160 actor_rollout_ref.rollout.n=5...
Hello, has this problem been solved?