RepVGG
fix data type debug
fix batchnorm data type debug
I understand the np.float32 part, thanks for the suggestion, but why was the cuda part deleted? Without it, computing the equivalent kernel during inference or training causes a GPU/CPU mismatch (RuntimeError: expected device cpu but got device cuda:0), because id_tensor must be on the same device as the branch weights. The latest version (self.id_tensor = torch.from_numpy(kernel_value).to(branch.weight.device)) looks good and works fine.
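For anyone hitting the same error, here is a minimal sketch of the device-safe pattern, not the repository's exact code. The helper names `make_identity_kernel` and `fuse_identity_bn` are hypothetical, and the sketch assumes the identity branch is a plain `nn.BatchNorm2d` folded into a 3x3 kernel. The point is only that the tensor is built as float32 in NumPy and then moved to whatever device the branch weights live on, rather than hard-coding cuda.

```python
import numpy as np
import torch
import torch.nn as nn


def make_identity_kernel(bn: nn.BatchNorm2d, groups: int = 1) -> torch.Tensor:
    """Hypothetical helper: 3x3 identity kernel on the same device as bn.weight."""
    channels = bn.num_features
    input_dim = channels // groups
    # float32 so the dtype matches the default dtype of the branch weights
    kernel_value = np.zeros((channels, input_dim, 3, 3), dtype=np.float32)
    for i in range(channels):
        kernel_value[i, i % input_dim, 1, 1] = 1.0
    # Follow the branch's device instead of calling .cuda() unconditionally
    return torch.from_numpy(kernel_value).to(bn.weight.device)


def fuse_identity_bn(bn: nn.BatchNorm2d):
    """Hypothetical helper: fold an identity + BN branch into (kernel, bias)."""
    id_tensor = make_identity_kernel(bn)
    std = (bn.running_var + bn.eps).sqrt()
    scale = (bn.weight / std).reshape(-1, 1, 1, 1)
    # Every operand is on bn.weight.device, so no cpu/cuda mismatch can occur
    return id_tensor * scale, bn.bias - bn.running_mean * bn.weight / std
```

Because the device is read from `bn.weight`, the same code runs unchanged whether the model is on the CPU or has been moved to a GPU.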