Question about converting Sander-wood (a Seq2Seq model) to Core ML

Open maxW2000 opened this issue 10 months ago • 1 comments

❓Question

If this is a question about the Core ML Framework or Xcode, please ask your question in the Apple Developer Forum: https://developer.apple.com/forums/

I am trying to convert Sander-wood to Core ML, but I run into a problem when running inference with the resulting mlpackage model. The code I used for the conversion is below. When I convert with flexible input shapes (which is what I actually need) and run inference, prediction always fails with the following error, even though my inputs have exactly the shape the model requires:
NSLocalizedDescription = "Unable to compute the prediction using a neural network model. It can be an invalid input data or broken/unsupported model (error code: -7).";
When I instead fix the input shapes at conversion time, inference works. Does anyone know why this happens, or am I simply passing the wrong shape? I don't think I am. I have also pasted my input shapes and the flexible shapes the model accepts below.

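```python
import coremltools as ct
import numpy as np
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer


class FullModelWrapper(torch.nn.Module):
    """Wraps the seq2seq model so that tracing sees a single tensor output."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, input_ids, decoder_input_ids):
        outputs = self.model(
            input_ids=input_ids,
            decoder_input_ids=decoder_input_ids,
            return_dict=False
        )
        # With return_dict=False the model returns a tuple; the first element is the logits.
        return outputs[0]
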
def convert_whole_sanderwood_coreml(model_path):
    tokenizer = AutoTokenizer.from_pretrained('/Users/maxfr/all-python-project/on-device-musc-generator/original')
    model = AutoModelForSeq2SeqLM.from_pretrained('/Users/maxfr/all-python-project/on-device-musc-generator/original', 
                                                  return_dict=False,
                                                  torchscript=True)
    
    model.eval()  
    config = model.config


    # Example encoder input, used only for tracing.
    text = "This is a traditional Irish dance music."
    inputs = tokenizer(text, return_tensors='pt', truncation=True, max_length=1024)
    input_ids = inputs['input_ids'].to(torch.int32)

    # Example decoder input for tracing: shape (1, 10), filled with the decoder start token.
    decoder_input_ids = torch.full((1, 10), model.config.decoder_start_token_id, dtype=torch.int32)
    #decoder_input_ids = torch.tensor([[config.decoder_start_token_id]], dtype=torch.int32)

    with torch.no_grad():
        traced_model = torch.jit.trace(FullModelWrapper(model).eval(),
                                       (input_ids, decoder_input_ids),
                                       check_trace=True,
                                       strict=False)

    """wrapped_model = FullModelWrapper(model)
    

    dynamic_shapes = {
        "input_ids": {1: torch.export.Dim("seq_len", min=1, max=1024)},
        "decoder_input_ids": {1: torch.export.Dim("seq_len", min=1, max=1024)}
    }
    

    exported_model = torch.export.export(
        wrapped_model,
        (input_ids, decoder_input_ids),
        dynamic_shapes=dynamic_shapes
    )"""
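    # Flexible sequence length: 1 to 1024 tokens, default 10.
    # Note that this single Shape object is reused for both inputs below.
    input_shape = ct.Shape(shape=(1,
                                  ct.RangeDim(lower_bound=1, upper_bound=1024, default=10)
                                ))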
    
    coreml_model = ct.convert(
        traced_model,
        inputs=[
            ct.TensorType(
                name="input_ids",
                shape=input_shape,  
                dtype=np.int32
            ),
            ct.TensorType(
                name="decoder_input_ids",
                shape=input_shape, 
                dtype=np.int32
            ),
        ],
        outputs=[
            ct.TensorType(
                name="logits",
            )
        ],
        minimum_deployment_target=ct.target.iOS16,
        compute_units=ct.ComputeUnit.CPU_ONLY,
        convert_to="mlprogram",
    )

    coreml_model.save(model_path)
    print(f"Model successfully converted and saved to {model_path}")
```
The shapes the model accepts:
  default shape: [1, 10]
  range: ['[1, 1]', '[1, 1024]']
  default shape: [1, 10]
  range: ['[1, 1]', '[1, 1024]']

The shapes of my inputs:
inputs input_ids: shape (1, 10), type int32
inputs decoder_input_ids: shape (1, 10), type int32
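
To double-check the claim that these inputs fall inside the accepted ranges, here is a minimal pure-Python sketch. The helper name `fits_range` and the hard-coded `LOWER`/`UPPER` bounds are hypothetical and are not part of coremltools; the bounds are simply copied from the "range" lines above. It only compares shapes and says nothing about dtypes or about the model itself.

```python
def fits_range(shape, lower, upper):
    """Return True if `shape` has the same rank as the bounds and every
    dimension lies within [lower[i], upper[i]] (inclusive)."""
    if not (len(shape) == len(lower) == len(upper)):
        return False
    return all(lo <= dim <= hi for dim, lo, hi in zip(shape, lower, upper))


# Bounds copied from the "range" lines above (hypothetical constants).
LOWER = (1, 1)
UPPER = (1, 1024)

# Shapes copied from the "shapes of my inputs" lines above.
inputs = {"input_ids": (1, 10), "decoder_input_ids": (1, 10)}

results = {name: fits_range(shape, LOWER, UPPER) for name, shape in inputs.items()}
for name, ok in results.items():
    print(f"{name}: {'within range' if ok else 'OUT OF RANGE'}")
```

Both inputs pass this check, which suggests the shapes themselves are within the declared bounds and the failure may come from somewhere else.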

Mar 05 '25 17:03 maxW2000

@maxW2000 The error indicates a shape mismatch. The model that you are exporting has fixed input shapes. Could you please try converting with flexible inputs? See https://apple.github.io/coremltools/docs-guides/source/flexible-inputs.html#set-the-range-for-each-dimension

Mar 06 '25 04:03 cymbalrush