junhyeok-motech

It seems that every process involved in FSDP (node count * GPUs per node) loads the model, so this is expected. Can I set low_cpu_mem_usage = True in the verl model loader?
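To put rough numbers on that, here is a minimal sketch of the arithmetic, assuming every rank holds one full copy of the weights in host RAM before FSDP shards them. The node count, GPU count, parameter count, and dtype size are hypothetical values chosen for illustration, and `host_ram_per_node_gib` is a made-up helper, not part of verl or FSDP.

```python
def host_ram_per_node_gib(num_params: int, bytes_per_param: int, gpus_per_node: int) -> float:
    """Host RAM (GiB) used on one node if each local rank loads a full copy of the weights."""
    per_rank_bytes = num_params * bytes_per_param
    return gpus_per_node * per_rank_bytes / 2**30


# Hypothetical cluster and model, for illustration only.
nodes = 2
gpus_per_node = 8
num_params = 7_000_000_000   # assumed 7B-parameter model
bytes_per_param = 2          # assumed bf16/fp16 weights

# One process per GPU, so the number of loading processes is nodes * gpus_per_node.
world_size = nodes * gpus_per_node

per_node_gib = host_ram_per_node_gib(num_params, bytes_per_param, gpus_per_node)

print(f"processes loading the model: {world_size}")
print(f"host RAM per node for weights alone: {per_node_gib:.1f} GiB")
```

For reference, low_cpu_mem_usage is an argument of from_pretrained in Hugging Face transformers. Whether the verl model loader passes it through is exactly what I am asking, so the sketch above does not assume it.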