Support MarkupLM

Open chrisgreg opened this issue 2 years ago • 5 comments

I'm trying to use this Hugging Face model, but I'm getting:

** (Mix) Could not start application extractor: exited in: Extractor.Application.start(:normal, [])
    ** (EXIT) an exception was raised:
        ** (RuntimeError) could not infer model type from the configuration, please specify the :module and :architecture options
            (bumblebee 0.2.0) lib/bumblebee.ex:297: Bumblebee.load_spec/2
            (bumblebee 0.2.0) lib/bumblebee.ex:411: Bumblebee.load_model/2
            (extractor 0.1.0) lib/extractor/application.ex:20: Extractor.Application.start/2
            (kernel 8.1.2) application_master.erl:293: :application_master.start_it_old/4
  {:ok, model} = Bumblebee.load_model({:hf, "microsoft/markuplm-base"})
  {:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "microsoft/markuplm-base-finetuned-websrc"})

I can't seem to find the information anywhere in the Hugging Face docs or config. Can someone guide someone not very versed in ML?

chrisgreg avatar Apr 10 '23 10:04 chrisgreg

I believe it means the model is not supported. We should probably improve the error message to say something like "could not infer model type from the configuration, this model is not supported out of the box by Bumblebee, please specify the :module and :architecture of a custom implementation".

josevalim avatar Apr 10 '23 11:04 josevalim

Hey @chrisgreg! MarkupLM is not currently supported. The error should be more specific; this is what I get for microsoft/markuplm-base:

** (RuntimeError) could not match the class name "MarkupLMForPretraining" to any of the supported models, please specify the :module and :architecture options

jonatanklosko avatar Apr 11 '23 08:04 jonatanklosko

Ahh, thanks @josevalim and @jonatanklosko. I guess I'll wait until it is supported, unless either of you knows of a currently supported model that lets me pass in some HTML and detects which part of the document is the "main" part with the important content?

I'd love to contribute, but I'm very new to anything ML-related, so I'm far more of a consumer than a contributor at this point 😅

chrisgreg avatar Apr 11 '23 14:04 chrisgreg

I've been looking for models to port over from Hugging Face Transformers. Maybe I'll take a crack at MarkupLM.

benbot avatar Apr 15 '23 13:04 benbot

I've been looking for models to port over from Hugging Face Transformers. Maybe I'll take a crack at MarkupLM.

Please do 😄

Papipo avatar Jun 05 '24 10:06 Papipo