[Bug]: cannot load 'de-pos-fine-grained'

Open lykoerber opened this issue 2 years ago • 6 comments

Describe the bug

Hi! I tried to use the fine-grained PoS-Tagger for German Twitter data described here, but unfortunately, I ran into an error trying to load the model. Is this model still included in flair? Otherwise, can you recommend another fine-grained flair model for PoS-tagging for German (apart from basic de-pos)? Many thanks in advance!

To Reproduce

import flair
model = flair.models.SequenceTagger.load('de-pos-fine-grained')

Expected behavior

no error?

Logs and Stack traces

Repository Not Found for url: https://huggingface.co/de-pos-fine-grained/resolve/main/pytorch_model.bin.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to access a private or gated repo, make sure you are authenticated.
Invalid username or password.

Screenshots

No response

Additional Context

No response

Environment

lykoerber, May 12 '23 10:05

Hello @LydiaKoerber, looking at the commit history, I think it was only added for version 4 but never got merged into master (and therefore isn't available in any later version).

The model is in an old version, and I will talk to @alanakbik about providing an up-to-date one.

In the meantime, you can download the model from https://nlp.informatik.hu-berlin.de/resources/models/de-pos-tweets/de-pos-twitter-v0.1.pt. When loading the model, you need to apply a simple fix:

from flair.models import SequenceTagger

tagger = SequenceTagger.load("de-pos-twitter-v0.1.pt")  # path to the downloaded model file
tagger.embeddings.embeddings[1].chars_per_chunk = 512  # add a missing value not stored in the model
tagger.embeddings.embeddings[2].chars_per_chunk = 512
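
If you want to reuse the fix, it could be wrapped in a small helper along these lines. This is only a sketch: `patch_chars_per_chunk` and `load_patched_tagger` are hypothetical names, not part of Flair, and the default indices (1, 2) and the value 512 are simply taken from the snippet above. The helper only sets the attribute where it is missing, so it should leave a newer model untouched.

```python
def patch_chars_per_chunk(tagger, indices=(1, 2), value=512):
    """Set chars_per_chunk on the selected stacked embeddings if it is missing.

    Returns the list of indices that were actually patched.
    """
    patched = []
    # Same attribute path as in the manual fix: the list of embeddings
    # inside the tagger's stacked embeddings.
    stack = tagger.embeddings.embeddings
    for i in indices:
        emb = stack[i]
        # Only fill in the value when the attribute is absent.
        if not hasattr(emb, "chars_per_chunk"):
            emb.chars_per_chunk = value
            patched.append(i)
    return patched


def load_patched_tagger(path):
    """Load the downloaded model file and apply the fix (requires flair)."""
    # Imported lazily so that patch_chars_per_chunk can be used without flair.
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(path)
    patch_chars_per_chunk(tagger)
    return tagger
```

Usage would then be `tagger = load_patched_tagger("de-pos-twitter-v0.1.pt")`, assuming the file was saved under the name from the download URL.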

helpmefindaname, May 15 '23 12:05

If I remember correctly, this model was actually trained by @stefan-it.

alanakbik, May 15 '23 13:05

Ah, so we could upload the model to the Model Hub?

stefan-it, May 15 '23 14:05

Hi @LydiaKoerber,

After a conversation with @alanakbik, we decided that it would be better to have a re-trained version on the Model Hub, because a lot of internal modules in the model are deprecated and outdated (for example CharLMEmbeddings, whereas FlairEmbeddings should be used now).

So I re-trained the model and documented it here.

I also uploaded the new model to the Hugging Face Model Hub - demo usage with latest Flair version:

from flair.data import Sentence
from flair.models import SequenceTagger

model = SequenceTagger.load('flair/de-pos-fine-grained')
sent = Sentence(r"@Sneeekas Ich nicht \o/", use_tokenizer=False)  # raw string: "\o" is not a valid escape sequence
model.predict(sent)

print(sent)

This should correctly return:

Sentence[4]: "@Sneeekas Ich nicht \o/" → ["@Sneeekas"/ADDRESS, "Ich"/PPER, "nicht"/PTKNEG, "\o/"/EMO]
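
If you need the tags per token rather than the printed summary, something like the following sketch might help. `token_tags` is a hypothetical helper, not a Flair function. It assumes a recent Flair version in which each token offers `get_label(label_type)` and the returned label has `value` and `score` attributes.

```python
def token_tags(sentence, label_type):
    """Return (token text, tag, score) for every token in a predicted sentence."""
    rows = []
    for token in sentence:
        # Look up the label that the tagger attached to this token.
        label = token.get_label(label_type)
        rows.append((token.text, label.value, label.score))
    return rows
```

With the example above, this could be called as `token_tags(sent, model.label_type)`, assuming the tagger exposes its label type under that attribute.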

I will prepare a PR for Flair (documentation update) and a model card on the Model Hub soon!

stefan-it, May 15 '23 22:05

Great, thanks a lot!

lykoerber, May 16 '23 08:05

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot], Sep 17 '23 01:09