Padding in the convolution modules of SinGANMSGeneratorPE when using explicit positional encoding (e.g. CSG, SPE)
Describe the issue
According to "Positional Encoding as Spatial Inductive Bias in GANs", zero padding leads to an unbalanced spatial bias with a vague relation between locations. Throughout the paper, the authors propose explicit positional encodings such as the Cartesian spatial grid (CSG) and sinusoidal positional encoding (SPE). When using these explicit positional encodings, they remove the padding from the convolutions in the generator and rely on bilinear upsampling instead.
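To make the contrast concrete, below is a minimal PyTorch sketch of that padding-free design (my own illustration, not code from the paper or from mmgeneration): an unpadded 3x3 convolution shrinks the feature map, bilinear upsampling restores the spatial size, and a Cartesian grid is concatenated as the explicit positional encoding.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def cartesian_grid(h, w):
    """2-channel Cartesian spatial grid (CSG) in [-1, 1], shape (1, 2, h, w)."""
    ys = torch.linspace(-1, 1, h).view(1, 1, h, 1).expand(1, 1, h, w)
    xs = torch.linspace(-1, 1, w).view(1, 1, 1, w).expand(1, 1, h, w)
    return torch.cat([xs, ys], dim=1)


class PaddingFreeBlock(nn.Module):
    """3x3 conv with padding=0; bilinear upsampling undoes the border shrink."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=0)

    def forward(self, x):
        h, w = x.shape[2:]
        out = F.leaky_relu(self.conv(x), 0.2)  # (h, w) -> (h - 2, w - 2)
        # Restore the spatial size with bilinear upsampling instead of
        # relying on zero padding as an implicit positional cue.
        return F.interpolate(
            out, size=(h, w), mode='bilinear', align_corners=True)


x = torch.randn(1, 3, 32, 32)        # toy input
pe = cartesian_grid(32, 32)          # explicit positional encoding
block = PaddingFreeBlock(3 + 2, 16)  # input channels + 2 grid channels
y = block(torch.cat([x, pe], dim=1))
print(y.shape)                       # torch.Size([1, 16, 32, 32])
```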
However, in the configs of this project (e.g. https://github.com/open-mmlab/mmgeneration/blob/master/configs/positional_encoding_in_gans/singan_csg_bohemian.py), I found that the implementation still uses padding of size 1 in the convolution modules. I wonder whether this is an exact reimplementation of the paper.
- What config did you run?
```python
_base_ = ['../singan/singan_fish.py']

num_scales = 10  # start from zero
model = dict(
    type='PESinGAN',
    generator=dict(
        type='SinGANMSGeneratorPE',
        num_scales=num_scales,
        padding=1,
        pad_at_head=False,
        first_stage_in_channels=2,
        positional_encoding=dict(type='CSG')),
    discriminator=dict(num_scales=num_scales))
train_cfg = dict(first_fixed_noises_ch=2)
data = dict(
    train=dict(
        img_path='./data/singan/bohemian.png',
        min_size=25,
        max_size=500,
    ))
dist_params = dict(backend='nccl', port=28120)
total_iters = 22000
```
- Did you make any modifications on the code or config? Did you understand what you have modified?
No, I haven't changed anything.
- What dataset did you use?
I used balloons.png, which is provided in the original SinGAN repository.
Environment
sys.platform: linux
Python: 3.8.10 (default, Sep 28 2021, 16:10:42) [GCC 9.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.1.TC455_06.29190527_0
GPU 0,1,2,3,4,5: NVIDIA RTX A6000
GCC: gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
PyTorch: 1.8.1+cu111
PyTorch compiling details: PyTorch built with:
- GCC 7.3
- C++ Version: 201402
- Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- NNPACK is enabled
- CPU capability usage: AVX2
- CUDA Runtime 11.1
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
- CuDNN 8.0.5
- Magma 2.5.2
- Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,
TorchVision: 0.9.1+cu111
OpenCV: 4.2.0
MMCV: 1.4.0
MMGen: 0.4.0+ac1c630
MMCV Compiler: GCC 9.3
MMCV CUDA Compiler: 11.1
Results
I don't have any results yet from removing the padding in the convolution modules. I will run the experiment and post the results as soon as possible; a sketch of the change I plan to test follows.
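For reference, the modification I plan to test is a minimal edit of the config above, changing only the `padding` value in the generator (this assumes `SinGANMSGeneratorPE` handles the resulting spatial-size change internally, which is exactly what I would like to confirm; this is my experimental change, not an official config):

```python
_base_ = ['../singan/singan_fish.py']

num_scales = 10  # start from zero
model = dict(
    type='PESinGAN',
    generator=dict(
        type='SinGANMSGeneratorPE',
        num_scales=num_scales,
        padding=0,  # changed from 1: drop zero padding as the paper describes
        pad_at_head=False,
        first_stage_in_channels=2,
        positional_encoding=dict(type='CSG')),
    discriminator=dict(num_scales=num_scales))
```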