
M2M100 Example?

Open fakerybakery opened this issue 2 years ago • 8 comments

Hello, I'm trying to convert M2M100 to CoreML. I saw that it is partially supported, and I was wondering if there's any example script to do this. Here's what I tried:

from exporters.coreml.convert import export
from exporters.coreml.models import M2M100CoreMLConfig
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer
model_ckpt = "facebook/m2m100_418M"
base_model = M2M100ForConditionalGeneration.from_pretrained(
    model_ckpt, torchscript=True
)
preprocessor = M2M100Tokenizer.from_pretrained(model_ckpt)
coreml_config = M2M100CoreMLConfig(
    base_model.config, 
    task="text2text-generation",
    use_past=False,
)
mlmodel = export(
    preprocessor, base_model, coreml_config
)

However, when trying to run this code, I get the following error:

ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds

Thank you in advance!

fakerybakery avatar May 26 '23 20:05 fakerybakery

Hi @fakerybakery! I think the easiest way is to use this automated Space, which uses exporters under the hood: https://huggingface.co/spaces/huggingface-projects/transformers-to-coreml

You enter the model id (facebook/m2m100_418M), then select the task you want the model to perform (text-to-text generation), and the encoder and decoder Core ML models will be pushed to a new repo or submitted as a PR to the original one.

I just followed this procedure and pushed the result to this repo. Feel free to clone it if you need to, or repeat the process yourself using different conversion settings.

Hope that helps.

pcuenca avatar May 30 '23 14:05 pcuenca

Thank you so much! I'm new to CoreML, so do you know if there's an example of how to implement a CoreML text2text-generation model in Swift? I checked huggingface/swift-coreml-transformers, but I couldn't find one. Thank you!

fakerybakery avatar May 31 '23 08:05 fakerybakery

The huggingface/swift-coreml-transformers project only ships two tokenizers: one for GPT and one for BERT.

If you want to use an M2M100 model, you'll have to find or create an M2M100 Swift tokenizer first.

I've also been trying text2text-generation, and the only solution I found is to use a GPT model like microsoft/DialoGPT-small.

I based my code on huggingface/swift-coreml-transformers and created madcato/huggingface-coreml to experiment.

Sorry, it's a little messy. Take a look at my GPT2Model (it's almost the same as the original), which adds a new method that slices faster than the original.

What I recommend is finding a GPT or BERT model that can be exported; a rough sketch follows below.
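For example, something like this might work for exporting DialoGPT. It's untested; the class name GPT2CoreMLConfig and the "text-generation" task are assumptions on my part, mirroring the M2M100 snippet above:

# Untested sketch: exporting a GPT-2 checkpoint instead of M2M100.
# GPT-2 is decoder-only, so there is no encoder/decoder split to deal with.
from exporters.coreml.convert import export
from exporters.coreml.models import GPT2CoreMLConfig  # assumed class name
from transformers import AutoTokenizer, GPT2LMHeadModel

model_ckpt = "microsoft/DialoGPT-small"
base_model = GPT2LMHeadModel.from_pretrained(model_ckpt, torchscript=True)
preprocessor = AutoTokenizer.from_pretrained(model_ckpt)

coreml_config = GPT2CoreMLConfig(
    base_model.config,
    task="text-generation",
    use_past=False,
)
mlmodel = export(preprocessor, base_model, coreml_config)
mlmodel.save("dialogpt_small.mlpackage")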

madcato avatar May 31 '23 08:05 madcato

"Did you succeed? I'm also working with the M2M100 model now and encountering the same issue: 'ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds'. If you succeeded, could you please tell me how you did it? Thank you very much!" @fakerybakery @madcato @pcuenca

beginner-byte avatar Jul 30 '24 07:07 beginner-byte

I successfully converted the models, specifically the encoder and decoder models. Below is the code for the model conversion. However, I have no idea how to use or validate the models.

from exporters.coreml.convert import export
from exporters.coreml.models import M2M100CoreMLConfig
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

def transform_to_coreml(seq2seq: str):
    """
    Export one half of the M2M100 encoder-decoder model to Core ML.

    Args:
        seq2seq: `"encoder"` to export the encoder part of the model,
            `"decoder"` to export the decoder part (per the exporters
            docs, `None` would mean the model is not encoder-decoder).
    """
    model_ckpt = "facebook/m2m100_418M"
    base_model = M2M100ForConditionalGeneration.from_pretrained(
        model_ckpt, torchscript=True
    )
    preprocessor = M2M100Tokenizer.from_pretrained(model_ckpt)

    coreml_config = M2M100CoreMLConfig(
        base_model.config,
        task="text2text-generation",
        use_past=False,
        seq2seq=seq2seq,
    )
    mlmodel = export(
        preprocessor, base_model, coreml_config
    )
    mlmodel.save(f"m2m100_{seq2seq}.mlpackage")

if __name__ == '__main__':
    transform_to_coreml("decoder")
    transform_to_coreml("encoder")

beginner-byte avatar Jul 30 '24 11:07 beginner-byte

I sincerely request your help. I am a beginner. Thank you very much. @fakerybakery @pcuenca @madcato

beginner-byte avatar Jul 30 '24 11:07 beginner-byte

> I successfully converted the models, specifically the encoder and decoder models. [...] However, I have no idea how to use or validate the models.

I was wondering how you managed to use the converted models?

EricPeter avatar Aug 06 '24 14:08 EricPeter

@beginner-byte did you manage to use the M2M100 model?

EricPeter avatar Aug 16 '24 07:08 EricPeter