micronet
Bug when training on multiple GPUs
Training IAO on multiple GPUs fails at this line:

    self.scale = torch.max(self.scale, self.eps)  # processing for very small scale

with the error:

    RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!

Has anyone else run into this?
For IAO, please use a single GPU for now.
That's painful; it doesn't fit in memory on a single GPU.
Reduce --eval_batch_size.
For IAO, I added the following device checks immediately before this line:

    output = (torch.clamp(self.round(input / self.scale - self.zero_point), self.quant_min_val, self.quant_max_val) + self.zero_point) * self.scale

The added code:

    if self.scale.device != input.device:
        self.scale = self.scale.to(input.device)
    if self.zero_point.device != input.device:
        self.zero_point = self.zero_point.to(input.device)
    if self.quant_min_val.device != input.device:
        self.quant_min_val = self.quant_min_val.to(input.device)
    if self.quant_max_val.device != input.device:
        self.quant_max_val = self.quant_max_val.to(input.device)

With this, multi-GPU training works. The key point is to use input.device directly rather than observer.device.
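For anyone who wants to try the same idea in isolation, here is a minimal sketch of the device-alignment workaround as a standalone module. The class name FakeQuantSketch, the helper _align_to, and the initial values (scale 0.1, zero point 0, signed 8-bit range) are made up for illustration and are not micronet's actual code; only the attribute names and the quantize/dequantize expression follow the snippet in this thread, with torch.round standing in for self.round.

```python
import torch
import torch.nn as nn


class FakeQuantSketch(nn.Module):
    """Hypothetical stand-in for the quantizer in this thread (not micronet's class)."""

    def __init__(self, bits=8):
        super().__init__()
        # Plain tensor attributes (assumed values); unlike registered buffers,
        # they do not automatically follow the module to another device.
        self.scale = torch.tensor(0.1)
        self.zero_point = torch.tensor(0.0)
        self.quant_min_val = torch.tensor(float(-(1 << (bits - 1))))
        self.quant_max_val = torch.tensor(float((1 << (bits - 1)) - 1))

    def _align_to(self, device):
        # Move each quantization tensor to the input's device if it is elsewhere.
        for name in ("scale", "zero_point", "quant_min_val", "quant_max_val"):
            tensor = getattr(self, name)
            if tensor.device != device:
                setattr(self, name, tensor.to(device))

    def forward(self, input):
        # Use input.device as the single source of truth for placement.
        self._align_to(input.device)
        q = torch.clamp(
            torch.round(input / self.scale - self.zero_point),
            self.quant_min_val,
            self.quant_max_val,
        )
        return (q + self.zero_point) * self.scale


if __name__ == "__main__":
    quantizer = FakeQuantSketch()
    x = torch.tensor([0.04, 0.26, 100.0])
    if torch.cuda.is_available():
        x = x.cuda()
    print(quantizer(x))
```

The loop is just a compact form of the four if statements above. This only addresses tensor placement in the forward pass; whether observer statistics stay consistent across replicas under multi-GPU training is a separate question that this sketch does not cover.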