unilm
[markuplm] Unable to use with Huggingface
Describe the bug Model: markuplm
The problem arises when using:
- [x] the official example scripts: (give details below)
To Reproduce Steps to reproduce the behavior:
pip install transformers
# Or main source code "git clone https://github.com/huggingface/transformers && cd transformers && pip install ."
from transformers import AutoTokenizer, MarkupLMForPretraining
tokenizer = AutoTokenizer.from_pretrained("microsoft/markuplm-large")
model = MarkupLMForPretraining.from_pretrained("microsoft/markuplm-large")
ValueError: Tokenizer class MarkupLMTokenizer does not exist or is not currently imported.
Expected behavior The tokenizer and model are properly loaded.
- Platform: Google Colab
- Python version:
- PyTorch version (GPU?):
MarkupLM is not yet supported by the Hugging Face transformers package, so for now you can only use it by downloading our source code. We are working on making MarkupLM available in transformers soon.
Hi,
I've added MarkupLM to Transformers here: https://github.com/NielsRogge/transformers/tree/modeling_markuplm/src/transformers/models/markuplm
However, I've not opened a PR yet, as I'd like to have a MarkupLMProcessor (similar to LayoutLMv2Processor) that prepares all the data for the model (rather than only tokenizing text).
Feel free to work further on my branch.
@NielsRogge Thanks for adding MarkupLM to the great transformers library! We have added a processor for MarkupLM, like LayoutLMv2Processor, as you requested, and opened a PR against your branch. However, this implementation is not complete, as we are not familiar with all the APIs in transformers. We would appreciate it very much if you could help us improve it and officially release it.
@NielsRogge Any updates on adding MarkupLM to Transformers?
@NielsRogge you are amazing. Thank you for this!
MarkupLM is now part of the Transformers library, feel free to close this issue :)
- Docs: https://huggingface.co/docs/transformers/model_doc/markuplm
- Demo notebooks: https://github.com/NielsRogge/Transformers-Tutorials/tree/master/MarkupLM
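For anyone landing here with the same ValueError, here is a minimal sketch of how loading might look with a Transformers release that includes MarkupLM. The helper names `markuplm_supported` and `embed_html` are made up for illustration, and the default checkpoint `microsoft/markuplm-base` is an assumption (the report above used `microsoft/markuplm-large`). The processor is also assumed to need `beautifulsoup4` and `lxml` installed for HTML parsing.

```python
import importlib
import importlib.util


def markuplm_supported() -> bool:
    """Return True if the installed transformers package exposes MarkupLM classes.

    Hypothetical helper: an older install without MarkupLM is what produces
    the "Tokenizer class MarkupLMTokenizer does not exist" error.
    """
    try:
        if importlib.util.find_spec("transformers") is None:
            return False  # transformers is not installed at all
        transformers = importlib.import_module("transformers")
        return hasattr(transformers, "MarkupLMProcessor") and hasattr(
            transformers, "MarkupLMModel"
        )
    except Exception:
        # Any import-time failure is treated as "not supported".
        return False


def embed_html(html: str, checkpoint: str = "microsoft/markuplm-base"):
    """Encode an HTML string and return the model's last hidden state.

    Hypothetical helper; downloads the checkpoint on first use.
    """
    import torch
    from transformers import MarkupLMModel, MarkupLMProcessor

    # The processor parses the HTML and builds token and XPath inputs.
    processor = MarkupLMProcessor.from_pretrained(checkpoint)
    model = MarkupLMModel.from_pretrained(checkpoint)

    encoding = processor(html, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**encoding)
    return outputs.last_hidden_state
```

If `markuplm_supported()` returns False, upgrading transformers is the likely fix. Otherwise something like `embed_html("<html><head><title>Page Title</title></head></html>")` should return a tensor of shape (batch, tokens, hidden size).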