Allow variable (computed) weights in convolution

Open praeclarum opened this issue 4 years ago • 18 comments

Description

I would like to be able to use Conv2d and Conv2dTranspose with variable weights. Currently, I get this error:

Input 'weight' of op 'Gs_1/G_synthesis/8x8/Conv0_up/conv2d_transpose' (conv_transpose) must be const at compile time.

when trying to convert StyleGAN 2.

Use cases

This is needed in modern GANs, where class labels and other embeddings are used to change the statistics of the convolution weights at runtime. In StyleGAN 2, this is used to implement "Weight Demodulation".

The revised architecture enables us to replace instance normalization with a “demodulation” operation, which we apply to the weights associated with each convolution layer.

Analyzing and Improving the Image Quality of StyleGAN

[Screenshot: weight demodulation excerpt from the StyleGAN2 paper]

They removed instance normalization in favor of this technique. (Instance normalization was causing quality issues.)
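
For concreteness, here is a simplified sketch of the modulation/demodulation step (illustrative names and shapes, not the paper's exact code; see the paper for the full formulation). The effective kernel is computed per sample from a style vector at runtime, which is why it cannot be a compile-time constant:

import torch
from torch.nn import functional as F

def modulated_conv2d(x, weight, style, eps=1e-8):
    # x: [B, in_c, H, W]; weight: [out_c, in_c, kh, kw]; style: [B, in_c]
    batch, in_c, height, width = x.shape
    out_c, _, kh, kw = weight.shape
    # Modulate: scale the kernel's input channels by the per-sample style.
    w = weight[None] * style[:, None, :, None, None]      # [B, out_c, in_c, kh, kw]
    # Demodulate: normalize each output filter to unit L2 norm.
    d = torch.rsqrt(w.square().sum(dim=[2, 3, 4]) + eps)  # [B, out_c]
    w = w * d[:, :, None, None, None]
    # Grouped-conv trick: fold the batch into groups so every sample
    # is convolved with its own runtime-computed kernel.
    x = x.reshape(1, batch * in_c, height, width)
    w = w.reshape(batch * out_c, in_c, kh, kw)
    y = F.conv2d(x, w, padding=kh // 2, groups=batch)
    return y.reshape(batch, out_c, height, width)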

Describe alternatives you've considered

  • Using alternative networks, since this is a critical feature of these architectures

praeclarum avatar Aug 14 '20 00:08 praeclarum

Have you found a solution?

sailor002 avatar Jan 02 '21 14:01 sailor002

Also struggling with this. Converting StyleGAN2 and variants from PyTorch to Core ML has hit roadblock after roadblock, and this is one of them we're seeing as well.

HorusAlkebulan avatar Apr 29 '21 21:04 HorusAlkebulan

@praeclarum have you solved it?

vogoriachko avatar Oct 21 '21 08:10 vogoriachko

I'm also getting this error with the latest coremltools version 5.1 when trying to convert the generator network from stylegan2-ada-pytorch:

ValueError: ('Op "x.23" (op_type: conv_transpose) Input weight must be const at compile time', 'weight', 'w.35')

kuprel avatar Nov 13 '21 13:11 kuprel

I think I've narrowed down the problem. The error "Input weight must be const" only occurs for conv_transpose layers, not conv layers. Looking at the MIL ops documentation, conv_transpose requires the weight argument to be constant while conv does not [1]. I think changing the implementation of conv_transpose to accept a non-constant weight argument, as conv does, would solve the problem here.

[1] https://apple.github.io/coremltools/source/coremltools.converters.mil.mil.ops.defs.html#module-coremltools.converters.mil.mil.ops.defs.conv
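
As a minimal check of this asymmetry (a sketch, not taken from the docs), the same traced graph converts when it uses conv2d with a runtime weight but fails with conv_transpose2d:

import torch
from torch import nn
from torch.nn import functional as F
import coremltools as ct

class DynamicConv(nn.Module):
    def forward(self, x, w):
        # An ordinary convolution whose weight is a runtime input, not a constant.
        return F.conv2d(x, w)

x = torch.randn(1, 128, 64, 64)
w = torch.randn(64, 128, 3, 3)
traced = torch.jit.trace(DynamicConv().eval(), (x, w))

inputs = [ct.TensorType('x', x.shape), ct.TensorType('w', w.shape)]
# Converts, since the MIL conv op accepts a variable weight (per the docs above).
# Replacing F.conv2d with F.conv_transpose2d (and w with shape [128, 64, 3, 3])
# reproduces the "Input weight must be const at compile time" error.
mlmodel = ct.convert(traced, inputs=inputs)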

kuprel avatar Nov 15 '21 14:11 kuprel

Hello,

Any news on this feature request? @kuprel, did you manage to find a way to "rewrite" conv_transpose so that it can be used in Core ML? Thanks

john7002 avatar Oct 24 '22 13:10 john7002

This worked for me. It still uses conv_transpose, but with constant weights:

import torch
from torch.nn import functional

def conv_transpose_stride2(x: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    # Pad by 1, then use a depthwise stride-2 transposed conv with a constant
    # all-ones 1x1 kernel purely to insert zeros between pixels (dilation).
    # The actual (variable) weight w is then applied with a regular conv2d,
    # using the channel-transposed, spatially flipped kernel. This reproduces
    # conv_transpose2d(x, w, stride=2) without a variable-weight conv_transpose.
    dilate = torch.nn.ConvTranspose2d(in_channels=128, out_channels=128, kernel_size=1, stride=2, groups=128, bias=False)
    dilate.weight.data = torch.ones([128, 1, 1, 1])
    pad = torch.nn.ZeroPad2d([1, 1, 1, 1])
    return functional.conv2d(dilate(pad(x)), w.transpose(0, 1).flip(2, 3))

if __name__ == "__main__":
    torch.manual_seed(0)
    x = torch.randn([1, 128, 256, 256])
    w = torch.randn([128, 64, 3, 3])
    y = functional.conv_transpose2d(x, w, stride=2)
    y_ = conv_transpose_stride2(x, w)
    with torch.no_grad():
        # The mean squared difference between the two outputs should be ~0.
        print((y - y_).square().mean().numpy(), y_.square().mean().numpy(), y.square().mean().numpy())

kuprel avatar Oct 24 '22 14:10 kuprel

Awesome! thanks

john7002 avatar Oct 24 '22 15:10 john7002

Hi @kuprel,

I tried your implementation, but I get a "TracerWarning: Trace had nondeterministic nodes." error:

# After your code.
import coremltools as ct
traced_model = torch.jit.trace(conv_transpose_stride2, (x, w))

This gives the following error:

/opt/homebrew/lib/python3.9/site-packages/torch/jit/_trace.py:828: TracerWarning: Trace had nondeterministic nodes. Did you forget call .eval() on your model? Nodes:
	%60 : Float(128, 1, 1, 1, strides=[1, 1, 1, 1], requires_grad=1, device=cpu) = aten::uniform_(%tensor, %57, %58, %59) # /opt/homebrew/lib/python3.9/site-packages/torch/nn/init.py:412:0
This may cause errors in trace checking. To disable trace checking, pass check_trace=False to torch.jit.trace()
  _check_trace(
/opt/homebrew/lib/python3.9/site-packages/torch/jit/_trace.py:828: TracerWarning: Output nr 1. of the traced function does not match the corresponding output of the Python function. Detailed error:
Tensor-likes are not close!

Mismatched elements: 16842766 / 16842816 (100.0%)
Greatest absolute difference: 134.11310195922852 at index (0, 23, 324, 138) (up to 1e-05 allowed)
Greatest relative difference: 15173386.579398053 at index (0, 14, 291, 143) (up to 1e-05 allowed)
  _check_trace(

Did you get this error? Did you solve it?

Best, Rahul Bhalley

RahulBhalley avatar Jan 12 '23 15:01 RahulBhalley
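
A likely cause (a reading of the warning, not confirmed in the thread): the helper constructs the ConvTranspose2d module inside the traced function, so its random weight initialization (the aten::uniform_ node in the warning) is captured by the trace. A sketch that avoids this by building the module once, outside the traced call (the subsequent Core ML conversion is not verified here):

import torch
from torch import nn
from torch.nn import functional

class ConvTransposeStride2(nn.Module):
    def __init__(self, channels: int = 128):
        super().__init__()
        # Built once here, so tracing never sees the random initialization.
        self.dilate = nn.ConvTranspose2d(channels, channels, kernel_size=1, stride=2, groups=channels, bias=False)
        self.dilate.weight.data = torch.ones([channels, 1, 1, 1])
        self.pad = nn.ZeroPad2d([1, 1, 1, 1])

    def forward(self, x: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
        return functional.conv2d(self.dilate(self.pad(x)), w.transpose(0, 1).flip(2, 3))

x = torch.randn(1, 128, 256, 256)
w = torch.randn(128, 64, 3, 3)
traced_model = torch.jit.trace(ConvTransposeStride2().eval(), (x, w))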

Can someone share a minimal example (i.e. a toy network which fails conversion because of variable weight convolution)?

TobyRoseman avatar Jan 12 '23 21:01 TobyRoseman

Following is a code snippet to reproduce the error @TobyRoseman.

Now come on Apple (@TobyRoseman), please help me with this issue https://github.com/apple/coremltools/issues/1723#issuecomment-1381527231, please! I want StyleGAN2 on my iPhone!! We developers can't wait any longer. This is extremely important: DL research is moving too fast, and Core ML is lagging behind (another case is the FFT ops).

import torch
from torch import nn
from torch.nn import functional as F

class Model(nn.Module):
    def __init__(self) -> None:
        super().__init__()

    def forward(self, x, w):
        return F.conv_transpose2d(x, w, padding=0, stride=2, groups=1)

x = torch.randn(1, 128, 128, 128)
w = torch.randn([128, 64, 3, 3])

model = Model().eval()
traced_model = torch.jit.trace(model, (x, w))

import coremltools as ct

inputs = [ct.TensorType('x', x.shape), ct.TensorType('w', w.shape)]
mlmodel = ct.convert(traced_model, inputs=inputs)

Then BOOM! You get an error:

ValueError: ('Op "22" (op_type: conv_transpose) Input weight must be const at compile time', 'weight', 'w')

RahulBhalley avatar Jan 13 '23 14:01 RahulBhalley

Is there any update regarding this? I am facing the same problem and am not able to get rid of this error:

Tuple detected at graph output. This will be flattened in the converted model.
Converting PyTorch Frontend ==> MIL Ops:  84%|██████████████████████████████▏     | 518/619 [00:00<00:00, 3731.02 ops/s]
Error during Core ML conversion: ('Op "706" (op_type: conv_transpose) Input weight must be const at compile time', 'weight', 'wi_center')

Could you suggest something, please, @TobyRoseman?

Here is my implementation:

import torch
import coremltools as ct
import torch.nn.functional as F

pretrained = "pretrained/states_pt_places2.pth"
generator_state_dict = torch.load(pretrained, map_location=torch.device('cpu'))['G']

if 'stage1.conv1.conv.weight' in generator_state_dict.keys():
    from model.networks import Generator
else:
    from model.networks_tf import Generator  

# Set up the network
generator = Generator(cnum_in=5, cnum=48, return_flow=False)
generator.load_state_dict(generator_state_dict, strict=True)


img = torch.rand([1, 5, 512, 512]).cpu()  # set image shape to 512x512
mask = torch.rand([1, 1, 512, 512]).cpu()

generator.cpu().eval()

# Use JIT tracing to convert the PyTorch model to TorchScript
example_inputs = (torch.rand(1, 5, 512, 512), torch.rand(1, 1, 512, 512))


traced_model = torch.jit.trace(generator, example_inputs)
# Create the Core ML input and output types
input_type = ct.ImageType(name="input", shape=img.shape, color_layout="RGB")
mask_type = ct.TensorType(name="mask", shape=mask.shape)
output_type = ct.ImageType(name="output", color_layout="RGB")

# Convert the TorchScript model to Core ML
try:
    coreml_model = ct.convert(
        traced_model,
        inputs=[input_type, mask_type],
        outputs=[output_type],
        debug=True
    )

    # Save the Core ML model to a file
    coreml_model.save("output.mlmodel")

    print('Successfully exported Core ML model')
    
except Exception as e:
    print(f"Error during Core ML conversion: {e}")


I am trying to convert this model: https://github.com/nipponjo/deepfillv2-pytorch

cc: @RahulBhalley @kuprel

Thanks in advance

ConradoMateu avatar May 08 '23 19:05 ConradoMateu

The issue here is that the conv_transpose MIL op requires its weight parameter to be a constant tensor. This is not something which can be fixed in the coremltools repository; it would require a change to the Core ML framework.

Please submit this Core ML framework issue using the Feedback Assistant. Once you have done that, please share the id value you get. The id value should start with "FB" followed by seven digits.

TobyRoseman avatar May 08 '23 22:05 TobyRoseman

I don't understand. Why can't the Core ML Tools team ask the Core ML team to fix this issue? No diss, but I think the current team probably doesn't take this work seriously (this issue is 3 years old). It's really affecting us! Please understand that marketing and sales are hard enough for us indie developers. At least make this development a breeze for us.

RahulBhalley avatar May 09 '23 15:05 RahulBhalley

Any updates, or are you not fixing this??? @TobyRoseman

RahulBhalley avatar Dec 08 '23 17:12 RahulBhalley

@RahulBhalley - did you submit this issue via the Feedback Assistant as I suggested? If so, do you have a Feedback ID?

TobyRoseman avatar Dec 08 '23 19:12 TobyRoseman

Any updates?

ykk648 avatar Jan 12 '24 15:01 ykk648