VL_adapter
A question about the zero_grad handling in VL-adapter's multitask.py.
Thanks for your brilliant work.
```python
batch['log_train_accuracy'] = self.args.log_train_accuracy

# self.optim.zero_grad()

if self.args.fp16 and _use_native_amp:
    with autocast():
        if self.args.distributed:
            results = self.model.module.train_step(batch)
        else:
            results = self.model.train_step(batch)
else:
    if self.args.distributed:
        results = self.model.module.train_step(batch)
    else:
        results = self.model.train_step(batch)

loss = results['loss']
```
Looking at this code, it appears that training proceeds without zeroing the gradients before backpropagation, since the `self.optim.zero_grad()` call is commented out. I would therefore expect gradients from previous iterations to accumulate across steps.
Is there a reason why this works?
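For context, here is why I expected the missing `zero_grad` to be a problem: `.grad` accumulates across `backward()` calls unless it is cleared. A minimal, self-contained sketch of this behavior (the toy model and data below are illustrative placeholders, not objects from multitask.py):

```python
import torch
import torch.nn as nn

# Toy setup, just to demonstrate gradient accumulation.
model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(8, 4), torch.randn(8, 1)

# First backward pass: .grad holds this step's gradient.
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
g1 = model.weight.grad.clone()

# Second backward pass without zero_grad: gradients add up.
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
print(torch.allclose(model.weight.grad, 2 * g1))  # True

# With optimizer.zero_grad() in between, each backward starts fresh.
optimizer.zero_grad()
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
print(torch.allclose(model.weight.grad, g1))  # True
```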