pytorch_bn_fusion
Batch normalization fusion for PyTorch
I loaded MobileNetV2 and ran it through the `fuse_bn_recursively` function, then printed the network structures of the two models, but I found that the bn_fusion net is the same as...
Weights in a `Conv2d` layer are stored as a tensor with shape `(out_channels, in_channels // groups, kernel_size[0], kernel_size[1])`, while weights in a `ConvTranspose2d` layer are stored as a tensor with shape `(in_channels, out_channels // groups, kernel_size[0], kernel_size[1])`. The output-channel axis is therefore dimension 0 for `Conv2d` but dimension 1 for `ConvTranspose2d`.
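The layout difference matters when folding batch-norm statistics into the preceding layer, because the per-channel scale has to be broadcast along the output-channel axis. Below is a minimal sketch of that idea. The helper name `fold_bn` is hypothetical and is not this repository's API. It assumes the batch norm is in eval mode with `affine=True`, and `groups=1` for the transposed case.

```python
import copy

import torch
import torch.nn as nn


@torch.no_grad()
def fold_bn(conv, bn):
    """Return a copy of `conv` with the eval-mode statistics of `bn` folded in.

    Hypothetical helper for illustration; assumes bn.affine is True and,
    for ConvTranspose2d, groups == 1.
    """
    fused = copy.deepcopy(conv)

    # Per-output-channel scale: gamma / sqrt(running_var + eps)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)

    # Output channels live on dim 0 for Conv2d and dim 1 for ConvTranspose2d.
    if isinstance(conv, nn.ConvTranspose2d):
        shape = (1, -1, 1, 1)
    else:
        shape = (-1, 1, 1, 1)
    fused.weight.copy_(conv.weight * scale.view(shape))

    # Treat a missing conv bias as zero, then apply the BN shift.
    bias = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
    fused.bias = nn.Parameter((bias - bn.running_mean) * scale + bn.bias)
    return fused


# Shapes for 3 input channels, 8 output channels, 3x3 kernels:
print(nn.Conv2d(3, 8, 3).weight.shape)           # torch.Size([8, 3, 3, 3])
print(nn.ConvTranspose2d(3, 8, 3).weight.shape)  # torch.Size([3, 8, 3, 3])
```

The fused layer should match `bn(conv(x))` up to floating-point rounding as long as the batch norm uses its running statistics, that is, after calling `.eval()`.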
I have benchmarked ResNet-50 and ResNet-101: BN fusion improves performance on CPU (by about 7%), but gives no improvement on CUDA. There is no noticeable difference between `torch.backends.cudnn.benchmark` set to true and false. My...
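When comparing CPU and CUDA timings like these, it helps to remember that CUDA kernels launch asynchronously, so the device should be synchronized before reading the clock. The following is a rough timing sketch, not the benchmark used in the report above. The helper name `mean_latency` and the small stand-in network are assumptions made for illustration.

```python
import time

import torch
import torch.nn as nn


def mean_latency(model, x, warmup=3, iters=10):
    """Average seconds per forward pass (hypothetical helper).

    Synchronizes CUDA before and after the timed loop so queued kernels
    are included in the measurement.
    """
    model.eval()
    with torch.no_grad():
        for _ in range(warmup):
            model(x)
        if x.is_cuda:
            torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
        if x.is_cuda:
            torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters


device = "cuda" if torch.cuda.is_available() else "cpu"
torch.backends.cudnn.benchmark = True  # only has an effect on CUDA/cuDNN

# Small stand-in network; substitute the original and fused models to compare.
net = nn.Sequential(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16), nn.ReLU()).to(device)
x = torch.randn(1, 3, 64, 64, device=device)
print(f"{device}: {mean_latency(net, x) * 1e3:.3f} ms per forward pass")
```

Running the same helper on the unfused and fused versions of a model, on the same device and input, gives a like-for-like comparison.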