How to pass multiple inputs, including an integer, to torch.jit.trace and coremltools convert?

Open bigmindapp opened this issue 3 years ago • 13 comments

The model looks like this:
def forward(self, masked_frames, num_local_frames: int):
    l_t = num_local_frames
    b, t, ori_c, ori_h, ori_w = masked_frames.size()
    # normalization before feeding into the flow completion module
    masked_local_frames = (masked_frames[:, :l_t, ...] + 1) / 2
    pred_flows = self.forward_bidirect_flow(masked_local_frames)

If I trace it like this:
traced_model = torch.jit.trace(model, (masked_imgs, len(neighbor_ids)))
torch.jit.save(traced_model, "traced_model.pth")
it raises an error.

Also, is the conversion code below correct?
model_input1 = ct.TensorType(name="masked_frames", shape=(1, 12, 3, 240, 432))
model_input2 = ct.TensorType(name="num_local_frames", shape=(1,))
model2 = ct.convert(
    model,
    inputs=[model_input1, model_input2],
    source="pytorch",
    minimum_deployment_target=ct.target.iOS13,
)

bigmindapp avatar Jun 10 '22 06:06 bigmindapp

By the way, if I load a pretrained model, do I need to run torch.jit.trace before ct.convert? I'm confused.

bigmindapp avatar Jun 10 '22 07:06 bigmindapp

Now I have changed the second input to a tensor, like this:

def forward(self, masked_frames, num_local_frames):
    l_t = num_local_frames.item()
    b, t, ori_c, ori_h, ori_w = masked_frames.size()
    # normalization before feeding into the flow completion module
    masked_local_frames = (masked_frames[:, :l_t, ...] + 1) / 2
    pred_flows = self.forward_bidirect_flow(masked_local_frames)

ids = torch.tensor(len(neighbor_ids))
traced_model = torch.jit.trace(model, (masked_imgs, ids))
# torch.save(traced_model, "traced_model2.pt")
traced_model.save("traced_model2.pt")
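
One caveat with this approach, sketched here on a toy module rather than the real model: torch.jit.trace records the Python number returned by .item() as a constant, so the slice length is fixed at trace time (PyTorch typically emits a TracerWarning about this). TakeFirstFrames and the tensor shapes below are made up for illustration.

```python
import torch


class TakeFirstFrames(torch.nn.Module):
    """Toy stand-in for the model: keeps the first n frames along dim 1."""

    def forward(self, frames, num_local_frames):
        # .item() returns a plain Python number, which the tracer cannot follow.
        l_t = int(num_local_frames.item())
        return frames[:, :l_t]


frames = torch.zeros(1, 12, 3)

# Trace with a frame count of 2.
traced = torch.jit.trace(TakeFirstFrames(), (frames, torch.tensor([2])))

# Call the traced module with a different count: the slice still uses 2.
out = traced(frames, torch.tensor([5]))
print(tuple(out.shape))  # (1, 2, 3)
```

If the count really has to vary at run time, tracing is probably not enough on its own for this part of the model.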

But when I load the model with model=torch.jit.load("traced_model2.pt"), I get an error:

cpp_module = torch._C.import_ir_module(cu, f, map_location, _extra_files)
RuntimeError: [enforce fail at inline_container.cc:208] . file not found: traced_model2/constants.pkl

If I load it with model=torch.load("traced_model2.pt") instead, I get a different error:

erialization.py", line 852, in _load
    result = unpickler.load()
ModuleNotFoundError: No module named 'torch'

PyTorch version: 1.5.1
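
For what it's worth, a file written by torch.jit.save is a zip archive, so the standard library can check whether the save completed and whether a constants.pkl entry is present. This is only a sketch: the helper names are hypothetical, and the exact archive layout may differ between PyTorch versions.

```python
import zipfile


def list_archive_entries(path):
    """Return the entry names stored in a zip-based TorchScript archive."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


def has_constants(path):
    """True if the file is a valid zip and contains a constants.pkl entry."""
    if not zipfile.is_zipfile(path):
        # An interrupted or failed save usually leaves an unreadable archive.
        return False
    return any(name.endswith("constants.pkl") for name in list_archive_entries(path))


# Example usage (path taken from the post above):
# print(has_constants("traced_model2.pt"))
```

A False result would suggest that the save step itself failed, which is worth checking before debugging the load.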

bigmindapp avatar Jun 10 '22 12:06 bigmindapp

I'm a little confused here. I'm not sure I understand all your questions.

If you're getting the error: ModuleNotFoundError: No module named 'torch', this is a problem with your environment. torch is not installed in the environment you are actually using. If you installed torch, you installed it in a different environment than the one you are using.
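
A quick way to check this, using only the standard library: print which interpreter is running and where (or whether) each package would be imported from. The helper name describe_module is hypothetical.

```python
import importlib.util
import sys


def describe_module(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


print("interpreter:", sys.executable)
for name in ("torch", "coremltools"):
    print(name, "->", describe_module(name) or "not importable in this environment")
```

Running this with the same python command used for the failing script shows whether torch is visible to that interpreter.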

If possible, it's recommended that you call torch.jit.trace on your model prior to conversion. Our support for scripted PyTorch models (i.e. non-traced models) is experimental.

It looks like you commented out the line torch.save(traced_model,"traced_model2.pt"). So that is probably why it can't find traced_model2.pt.

I think that addresses all of your questions. If you have additional questions, please state those questions as clearly as possible.

TobyRoseman avatar Jun 10 '22 23:06 TobyRoseman

Hi TobyRoseman, thanks for your reply.

In the end I load the .pth and run torch.jit.trace inside the original model script instead of a separate file, so I don't need to save the traced model.

# net = importlib.import_module('model.' + args.model)
# model = net.InpaintGenerator().to(device)
# data = torch.load(args.ckpt, map_location=device)
# model.load_state_dict(data)
model = torch.load("e2fgvi_hq.pth")
print(f'Loading model from: {args.ckpt}')
model.eval()

ids = torch.tensor(len(neighbor_ids))
traced_model = torch.jit.trace(model, (masked_imgs, ids))
# torch.save(traced_model, "traced_model2.pt")
# traced_model.save("traced_model2.pt")
# exit()
model_input1 = ct.TensorType(name="masked_frames", shape=(1, 12, 3, 240, 432))
model_input2 = ct.TensorType(name="num_local_frames", shape=ids.shape)
print(model_input2.shape.shape)
model_output = ct.TensorType(name="output", shape=(1, 12, 3, 240, 432))
print("#####" + str(ct.SPECIFICATION_VERSION))
model2 = ct.convert(
    traced_model,
    inputs=[model_input1, model_input2],
    source="pytorch",
    minimum_deployment_target=ct.target.iOS13,
)
# model2.version = "2.0"
spec = model2.get_spec()
spec.specificationVersion = ct._SPECIFICATION_VERSION_IOS_13
model2.save("netnew.mlmodel")

However, I get different errors with different coremltools versions. One of them, with conda install pytorch==1.5.1 torchvision==0.6.1 torchaudio==0.7.2 cudatoolkit -c pytorch + coremltools=4.1:

File "/Users/mac/opt/anaconda3/envs/e2fgvi36/lib/python3.6/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 55, in convert_nodes
    "PyTorch convert function for op '{}' not implemented.".format(node.kind)
RuntimeError: PyTorch convert function for op 'new_zeros' not implemented.

bigmindapp avatar Jun 11 '22 14:06 bigmindapp

conda install pytorch==1.5.1 torchvision==0.6.1 torchaudio==0.7.2 cudatoolkit -c pytorch + coremltools=5.1

File "/Users/mac/opt/anaconda3/envs/e2fgvi36/lib/python3.6/site-packages/coremltools/converters/mil/mil/builder.py", line 198, in placeholder
    return Placeholder(shape, dtype)
  File "/Users/mac/opt/anaconda3/envs/e2fgvi36/lib/python3.6/site-packages/coremltools/converters/mil/mil/program.py", line 92, in __init__
    raise ValueError('Rank-0 (input {}) is unsupported'.format(name))

conda install pytorch==1.5.1 torchvision==0.6.1 torchaudio==0.7.2 cudatoolkit -c pytorch + coremltools=6.0b1

 File "/Users/mac/opt/anaconda3/envs/e2fgvi36/lib/python3.6/site-packages/coremltools/converters/mil/frontend/torch/load.py", line 45, in load
    if type(torchscript) == _torch.jit._script.RecursiveScriptModule:
AttributeError: module 'torch.jit' has no attribute '_script'

bigmindapp avatar Jun 11 '22 15:06 bigmindapp

Then I upgraded PyTorch to 1.7.1 with conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit -c pytorch. With coremltools versions from 4.1 to 6.0b2 there are still many errors. It's a headache.

conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit -c pytorch + coremltools=4.1

 File "/Users/mac/opt/anaconda3/envs/e2fgvi36/lib/python3.6/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 55, in convert_nodes
    "PyTorch convert function for op '{}' not implemented.".format(node.kind)
RuntimeError: PyTorch convert function for op 'new_zeros' not implemented.

conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit -c pytorch + coremltools=5.1

 return Placeholder(shape, dtype)
  File "/Users/mac/opt/anaconda3/envs/e2fgvi36/lib/python3.6/site-packages/coremltools/converters/mil/mil/program.py", line 92, in __init__
    raise ValueError('Rank-0 (input {}) is unsupported'.format(name))
ValueError: Rank-0 (input None) is unsupported

conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit -c pytorch + coremltools=6.0b1

  File "/Users/mac/opt/anaconda3/envs/e2fgvi36/lib/python3.6/site-packages/coremltools/converters/mil/mil/program.py", line 104, in __init__
    raise ValueError('Rank-0 (input {}) is unsupported'.format(name))
ValueError: Rank-0 (input None) is unsupported
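
The Rank-0 message looks consistent with how ids is built in the code above: torch.tensor(len(neighbor_ids)) is a scalar tensor with shape (), and that shape is what gets passed to ct.TensorType. A minimal sketch of the difference, assuming a rank-1 tensor of shape (1,) is acceptable for the model (the count value is made up):

```python
import torch

count = 5  # stand-in for len(neighbor_ids)

rank0 = torch.tensor(count)    # scalar tensor, shape ()
rank1 = torch.tensor([count])  # one-element vector, shape (1,)

print(tuple(rank0.shape), rank0.dim())  # () 0
print(tuple(rank1.shape), rank1.dim())  # (1,) 1

# .item() works on any one-element tensor, so forward() would not need to change.
print(rank1.item())  # 5
```

Whether the conversion then succeeds with the rest of this model has not been verified here.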

bigmindapp avatar Jun 11 '22 15:06 bigmindapp

There was an error when saving the traced model that I had overlooked, so "traced_model2.pt" was written without its constants.pkl file and could not be loaded correctly. My problem now is that the model uses a third-party function called ModulatedDeformConv2dFunction from the mmcv library, which is built into a .so file. Functions like this seem difficult to convert. How can I work around this kind of problem?

<module 'mmcv._ext' from '/Users/mac/opt/anaconda3/envs/e2fgvi36/lib/python3.6/site-packages/mmcv/_ext.cpython-36m-darwin.so'>
  0%|                                                    | 0/14 [03:56<?, ?it/s]
Traceback (most recent call last):
  File "test6.py", line 370, in <module>
    main_worker()
  File "test6.py", line 294, in main_worker
    traced_model.save("traced_model3.pt")
  File "/Users/mac/opt/anaconda3/envs/e2fgvi36/lib/python3.6/site-packages/torch/jit/_script.py", line 487, in save
    return self._c.save(*args, **kwargs)
RuntimeError: 
Could not export Python function call 'ModulatedDeformConv2dFunction'. Remove calls to Python functions before export. Did you forget to add @script or @script_method annotation? If this is a nn.ModuleList, add it to __constants__:

bigmindapp avatar Jun 13 '22 16:06 bigmindapp

ModulatedDeformConv2dFunction is a subclass of torch.autograd.Function. I found many issues reporting failures when exporting a torch.autograd.Function. This may be a problem that needs to be solved.

bigmindapp avatar Jun 13 '22 17:06 bigmindapp

Support for the new_zeros PyTorch layer type was added in our latest beta. To install this beta, run:

pip install coremltools==6.0b1

I've never seen the other PyTorch-related errors before. What version of PyTorch are you using?

Can you give us a self-contained minimal example to reproduce the rank-0 issue?
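
For reporting versions, something like the following prints what is installed in the active environment. It is a sketch that assumes Python 3.8 or newer, where importlib.metadata is in the standard library (the environment in this thread uses Python 3.6, where it is not available). The helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for package in ("torch", "torchvision", "coremltools"):
    print(f"{package}: {installed_version(package) or 'not installed'}")
```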

TobyRoseman avatar Jun 13 '22 18:06 TobyRoseman

Thanks TobyRoseman. I finally found that the problem is in ModulatedDeformConv2dFunction, which is a subclass of torch.autograd.Function. I found many issues reporting that exporting a torch.autograd.Function fails.

The issue can be reproduced as follows:

git clone https://github.com/MCG-NKU/E2FGVI.git
conda create -n e2fgvi python=3.6
conda activate e2fgvi

conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit -c pytorch
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu92/torch1.7/index.html
conda install tensorboard matplotlib scikit-image==0.16.2
pip install tqdm 

Download the .pth from Google Drive to E2FGVI/release_model/

Add these lines to File "/Users/mac/github/E2FGVI/test.py", line 123 ==>

jit_model = torch.jit.script(model)
# torch.save(jit_model, "traced_model2.pt")
jit_model.save("traced_model3.pt")
exit()

Change File "/Users/mac/github/E2FGVI/model/e2fgvi_hq.py", line 101 ==>

def forward(self, x):
    bt, c, _, _ = x.size()
    # h, w = h//4, w//4
    out = x
    x0 = out
    _, _, h, w = x0.size()
    for i, layer in enumerate(self.layers):
        if i == 8:
            x0 = out
            _, _, h, w = x0.size()
        if i > 8 and i % 2 == 0:
            g = self.group[(i - 8) // 2]
            x = x0.view(bt, g, -1, h, w)
            o = out.view(bt, g, -1, h, w)
            out = torch.cat([x, o], 2).view(bt, -1, h, w)
        out = layer(out)
    return out

File "/Users/mac/github/E2FGVI/model/e2fgvi_hq.py", line 129 ==> scale_factor=2 ==> scale_factor=2.0

File "/Users/mac/github/E2FGVI/model/modules/feat_prop.py", line 132 ==>

                featx = []
                for k in feats:
                    if k not in ['spatial', module_name]:
                        featx[k] = [
                            feats[k][idx]
                            
                        ]
                feat = [feat_current] + featx + [feat_prop]

File "/Users/mac/github/E2FGVI/model/modules/feat_prop.py", line 150 ==>

outputs = []
for i in range(0, t):
    align_feats = []
    for k in feats:
        if k != 'spatial':
            align_feats[k] = [feats[k].pop(0)]
    align_feats = torch.cat(align_feats, dim=1)
    outputs.append(self.fusion(align_feats))

Finally, it stops with the error below:

Python builtin <built-in method apply of FunctionMeta object at 0x7fe65b029ac8> is currently not supported in Torchscript:
  File "/Users/mac/github/E2FGVI/model/modules/feat_prop.py", line 55
        mask = torch.sigmoid(mask)

        return modulated_deform_conv2d(x, offset, mask, self.weight, self.bias,
               ~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
                                       self.stride, self.padding,
                                       self.dilation, self.groups,

The modulated_deform_conv2d function in mmcv is written like this:

from torch.autograd import Function
class ModulatedDeformConv2dFunction(Function):
...
modulated_deform_conv2d = ModulatedDeformConv2dFunction.apply

So the problem now becomes how to export a torch.autograd.Function with torch.jit.script or torch.jit.trace, so that the result can then be converted with coremltools. I could not find a clear answer after searching. Do you have any suggestions? Thanks very much.
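
A smaller, self-contained sketch of the same situation, without mmcv: DoubleFunction below is a hypothetical stand-in for ModulatedDeformConv2dFunction, and can_export is a hypothetical helper. It traces a module and reports whether the trace can be serialized. The outcome for the custom function may depend on the PyTorch version, so no result is claimed for it here.

```python
import io

import torch


class DoubleFunction(torch.autograd.Function):
    """Hypothetical stand-in for a custom op implemented as autograd.Function."""

    @staticmethod
    def forward(ctx, x):
        return x * 2

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output * 2


class UsesCustomFunction(torch.nn.Module):
    def forward(self, x):
        return DoubleFunction.apply(x)


class PlainDouble(torch.nn.Module):
    """Same computation written with ordinary tensor ops."""

    def forward(self, x):
        return x * 2


def can_export(module, example):
    """Trace `module` and report whether the trace can be serialized."""
    traced = torch.jit.trace(module, example, check_trace=False)
    try:
        torch.jit.save(traced, io.BytesIO())
        return True
    except RuntimeError as err:
        message = str(err).strip()
        print("export failed:", message.splitlines()[0] if message else type(err).__name__)
        return False


example = torch.ones(1, 3)
print("plain ops:", can_export(PlainDouble(), example))
print("autograd.Function:", can_export(UsesCustomFunction(), example))
```

If the second case fails while the first succeeds, that isolates the custom function as the blocker, independent of the rest of the E2FGVI model.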

bigmindapp avatar Jun 14 '22 08:06 bigmindapp

Hi @bigmindapp - This issue has been a bit chaotic. What exactly is the current issue? Can you share a stack trace and error message?

Also, in order to help we really need a minimal example that reproduces the problem. Can you share a small amount of self contained code that reproduces the problem?

TobyRoseman avatar Jun 14 '22 21:06 TobyRoseman

Is this issue solved? Having two inputs rather than one is necessary... please support it! T.T

NeighborhoodCoding avatar Aug 12 '22 06:08 NeighborhoodCoding

@NeighborhoodCoding - The problem here is not clear. Can you give us a minimal example to reproduce the issue?

TobyRoseman avatar Aug 12 '22 23:08 TobyRoseman

Since we have not heard back here, I'm going to close the issue.

TobyRoseman avatar Nov 09 '22 21:11 TobyRoseman