HollrayChan

Results 15 comments of HollrayChan

@BenSpex Thank you for your reply. I then adjusted the learning rate and used a larger batch size in a multi-node, multi-GPU setup. At first, the loss dropped as quickly as the...

Thank you very much for your reply. Regarding global_crop_size, I will modify the source code so that it supports non-square inputs; I will try the hyperparameters...

> @HollrayChan Hi, I'm experiencing similar issues where the loss doesn't converge. I'm wondering if you've managed to resolve your problems. Could you please share any updates or results? thanks!...

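> I hit the same problem when running vllm 0.8.2 on 2 nodes. On someone's advice I turned off vLLM's V1 engine, and after that everything worked normally. Concretely: set export VLLM_USE_V1=0, and do not set actor_rollout_ref.rollout.enforce_eager=False actor_rollout_ref.rollout.free_cache_engine=False in the training parameters. On my side, with 4 GPUs of 24 GB each, setting only actor_rollout_ref.rollout.enforce_eager=False and actor_rollout_ref.rollout.free_cache_engine=False was enough.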
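The settings named in the comment above could be written down roughly as follows. This is only a sketch: the variable name `ROLLOUT_OVERRIDES` is a hypothetical helper, and the actual training command is not shown in the comments, so it is left out here.

```shell
# Turn off the vLLM V1 engine, as the quoted comment describes.
export VLLM_USE_V1=0

# The two rollout overrides mentioned in the comments, kept in one variable
# (hypothetical name) so they can be appended to, or left out of, the
# training command depending on which of the two reports applies.
ROLLOUT_OVERRIDES="actor_rollout_ref.rollout.enforce_eager=False actor_rollout_ref.rollout.free_cache_engine=False"

# Print the resulting settings for inspection; the real launch command is
# not part of the comments and is therefore not reproduced here.
echo "VLLM_USE_V1=${VLLM_USE_V1}"
echo "overrides: ${ROLLOUT_OVERRIDES}"
```

The two reports differ on whether the overrides should be set, so keeping them in a separate variable makes it easy to try both variants.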

Let me describe a few approaches I've tried. I initially used DINOv1 for pre-training, then fine-tuned all parameters. When the data size reached 10 million (1kw), the following patterns emerged, and these...