Jintao Lin issues


Results: 11 issues of Jintao Lin

About VILADistributedSampler and gradient_accumulation_steps

If we use the VILADistributedSampler (https://github.com/Efficient-Large-Model/VILA/blob/main/llava/train/llava_trainer.py#L274-L281) for distributed training, should `gradient_accumulation_steps` be hardcoded to 1? Since I notice that when I use 4 nodes (8 GPUs per node) to...
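
For context on why `gradient_accumulation_steps` matters in the setup described above, here is a minimal sketch of the usual data-parallel batch-size arithmetic. It assumes plain data-parallel training in which every rank processes its own micro-batch; it does not describe how VILADistributedSampler itself partitions data. The function name and the per-device batch size of 2 are hypothetical values chosen for illustration only.

```python
def effective_batch_size(per_device_batch_size: int,
                         num_nodes: int,
                         gpus_per_node: int,
                         gradient_accumulation_steps: int) -> int:
    """Samples contributing to one optimizer step under plain data parallelism."""
    # Total number of ranks (one process per GPU is assumed).
    world_size = num_nodes * gpus_per_node
    # Each rank accumulates gradients over several micro-batches before stepping.
    return per_device_batch_size * world_size * gradient_accumulation_steps


# 4 nodes with 8 GPUs each, as in the question; batch size 2 is hypothetical.
print(effective_batch_size(2, 4, 8, gradient_accumulation_steps=1))
print(effective_batch_size(2, 4, 8, gradient_accumulation_steps=4))
```

Under these assumptions, raising `gradient_accumulation_steps` multiplies the number of samples consumed per optimizer step, so any sampler that pre-computes per-rank batches would need to account for it.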