Add push to Hub functionalities

Open osanseviero opened this issue 1 year ago • 7 comments

This is a follow-up of #2280

Flair users would benefit from being able to easily share their models, try them out with the Hub widgets, and load models from others. Here is an example repo created with it

from flair.models import SequenceTagger

model = SequenceTagger.load(...)
model.push_to_hub('my-own-ner-english')

This also implements basic automatic model card generation with the appropriate tags so that Flair models are discoverable. It uses a non-git upload method, so users are not required to use git locally, even though the remote server does use git versioning.

Let us know what you think! cc @stefan-it

osanseviero avatar Aug 08 '22 17:08 osanseviero

cc @Wauplin and @LysandreJik :hugs:

osanseviero avatar Aug 08 '22 17:08 osanseviero

@osanseviero looks good, though when I run this code:

from flair.models import SequenceTagger

model: SequenceTagger = SequenceTagger.load('ner-fast')
model.push_to_hub(repo_id='flair/test-push', private=True)

it fails with

OSError: You need to provide a `token` or be logged in to Hugging Face with `huggingface-cli login`.

The first part of the error message is a bit hard to parse. How can I provide a token?

alanakbik avatar Aug 30 '22 14:08 alanakbik

Could be related to #2919 - it would be good to have a consistent way of passing tokens.

alanakbik avatar Aug 30 '22 14:08 alanakbik

Ah, good point about the error message! I was assuming the user is already logged in through huggingface-cli login. In that case the utilities (upload_folder and create_repo) retrieve the token automatically.

What we can do is add a token parameter to push_to_hub which would be None by default (when it's None we retrieve the token automatically). Do you think this would work?
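To make the proposed behaviour concrete, here is a minimal sketch of the "explicit token, else fall back to the cached login" logic. It is not the code from this PR: `resolve_token` and `load_cached_token` are hypothetical names used only for illustration, and in practice the fallback lookup is handled inside huggingface_hub by the upload utilities themselves.

```python
from typing import Callable, Optional


def resolve_token(
    token: Optional[str] = None,
    # Hypothetical hook standing in for "read the token saved by
    # `huggingface-cli login`"; the default pretends nothing is cached.
    load_cached_token: Callable[[], Optional[str]] = lambda: None,
) -> str:
    """Return the explicit token if one is given, else the cached one."""
    if token is not None:
        # An explicitly passed token always wins.
        return token
    cached = load_cached_token()
    if cached is None:
        # Neither source produced a token, so say how to fix it.
        raise OSError(
            "No token found: pass `token=...` to push_to_hub "
            "or run `huggingface-cli login` first."
        )
    return cached
```

With something like this, `push_to_hub(..., token=None)` would keep working for users who are already logged in, while users who are not could pass `token="..."` directly.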

Supporting a token for loading will be a bit more complex, as it would require changing the loading function.

osanseviero avatar Aug 30 '22 14:08 osanseviero

Yes, a token parameter that defaults to None makes sense here!

alanakbik avatar Aug 30 '22 21:08 alanakbik

Done! Here is a repo I created by passing the token.

osanseviero avatar Aug 31 '22 09:08 osanseviero

Looks like mypy is throwing some odd errors again, somehow checking code in some cache directory:

FAILED cache/flair/datasets/senteval/data/convert.py::flake-8::FLAKE8
FAILED cache/flair/datasets/senteval/data/convert.py::ISORT
FAILED cache/flair/datasets/senteval/data/mytokenize.py::mypy
FAILED cache/flair/datasets/senteval/data/mytokenize.py::flake-8::FLAKE8
FAILED cache/flair/datasets/senteval/data/mytokenize.py::ISORT

@helpmefindaname do you have an idea what is going on here?

alanakbik avatar Aug 31 '22 15:08 alanakbik

Hey there :) it seems things were fixed, should we merge this PR?

osanseviero avatar Sep 30 '22 08:09 osanseviero

Hello @osanseviero, sorry for the delay in checking this.

I've created a token and pushed a small model, experimenting with both a public and private repo:

For public I used:

model: SequenceTagger = SequenceTagger.load('ner-fast')
model.push_to_hub(repo_id='alanakbik/test-push-public', private=False, token="[ .. my token ..]")

For private I used:

model: SequenceTagger = SequenceTagger.load('ner-fast')
model.push_to_hub(repo_id='alanakbik/test-push-private', private=True, token="[ .. my token ..]")

The public one works, the private one doesn't. I am getting this error:

Repository Not Found for url: https://huggingface.co/api/models/alanakbik/test-push-private.
Please make sure you specified the correct `repo_id` and `repo_type`.
If the repo is private, make sure you are authenticated.

In spite of the error, a private repo is created at https://huggingface.co/alanakbik/test-push-private, but it is empty. Any ideas what I am doing wrong?

alanakbik avatar Oct 27 '22 13:10 alanakbik

@osanseviero thanks for adding this and @Wauplin thanks for finding the fix for private repos!

alanakbik avatar Oct 27 '22 15:10 alanakbik