
[Question]: Is a config.json file available?

Open katwegner opened this issue 2 years ago • 6 comments

Question

Dear developers,

I am a keen user of your FLAIR models and am looking for a configuration file that would let us deploy a model on a cluster. I'm in awe of the FLAIR project: it does an outstanding job of recognising named entities in German like no other available model. To make full use of the model on our data, we would like to deploy it to a cluster. Unfortunately, the cluster requires that we provide a config.json file describing the model details. The open-source and free nature of the library suggests that growing the user community and the range of use cases is one of your goals. I would therefore like to ask whether you would be willing to make such a configuration file publicly available so that FLAIR models can also be used on a cluster. Specifically, I am referring to the ner-german-large model. Thank you very much for considering this request; I am looking forward to hearing back from you. Best, Katharina

katwegner avatar May 22 '23 11:05 katwegner

Hello @katwegner,

I suppose by config.json you are referring to the setup of a huggingface/transformers model? This is not possible for Flair, as Flair builds on several kinds of embeddings - possibly combining multiple - and adds additional layers on top.
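To illustrate, here is a minimal sketch (not the exact recipe of ner-german-large, whose composition may differ) of why a single transformers-style config.json cannot describe a Flair tagger:

```python
# Minimal sketch: a Flair tagger composes several embeddings and adds its own
# layers on top; the exact setup of ner-german-large may differ.
from flair.data import Dictionary
from flair.embeddings import FlairEmbeddings, StackedEmbeddings, TransformerWordEmbeddings
from flair.models import SequenceTagger

# A transformer encoder (this is where a huggingface config.json belongs)
# combined with character-level Flair embeddings.
embeddings = StackedEmbeddings([
    TransformerWordEmbeddings("xlm-roberta-large"),
    FlairEmbeddings("de-forward"),
    FlairEmbeddings("de-backward"),
])

# Toy label dictionary; in a real training run this is built from a corpus.
tag_dictionary = Dictionary(add_unk=False)
for tag in ["O", "B-PER", "I-PER", "B-LOC", "I-LOC"]:
    tag_dictionary.add_item(tag)

# The tagger adds projection (and optionally CRF) layers on top of the stacked
# embeddings; the composite model is saved as one PyTorch file, not a config.json.
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=tag_dictionary,
    tag_type="ner",
)
```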

About the deployment: what are you trying to use for deployment? I can assure you that a Docker container running a simple REST service can be deployed on any Kubernetes cluster or cloud service. There may be specific frameworks that make it easier to deploy certain models, but in that case it might make more sense to ask them to add support for Flair models.
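For example, a minimal REST wrapper could look like the sketch below; the choice of FastAPI and the request/response shape are my assumptions, not part of the project:

```python
# Minimal sketch of a REST service wrapping a Flair tagger; FastAPI and the
# endpoint shape are assumptions for illustration only.
from fastapi import FastAPI
from pydantic import BaseModel

from flair.data import Sentence
from flair.models import SequenceTagger

app = FastAPI()
tagger = SequenceTagger.load("flair/ner-german-large")  # load once at startup


class TagRequest(BaseModel):
    text: str


@app.post("/ner")
def tag(request: TagRequest) -> dict:
    # Tag the incoming text and return the recognized entities.
    sentence = Sentence(request.text)
    tagger.predict(sentence)
    return {
        "entities": [
            {
                "text": span.text,
                "label": span.get_label("ner").value,
                "score": span.get_label("ner").score,
            }
            for span in sentence.get_spans("ner")
        ]
    }

# Run with: uvicorn app:app --host 0.0.0.0 --port 8000
# Putting that command into a Dockerfile gives an image any cluster can run.
```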

helpmefindaname avatar May 22 '23 12:05 helpmefindaname

@helpmefindaname Are you sure config.json is not possible? When I run something like tagger = SequenceTagger.load("flair/ner-english-large")

I see this

pytorch_model.bin: 100% 2.24G/2.24G [02:27<00:00, 15.2MB/s]
tokenizer_config.json: 100% 25.0/25.0 [00:00<00:00, 367kB/s]
config.json: 100% 616/616 [00:00<00:00, 13.1MB/s]
sentencepiece.bpe.model: 100% 5.07M/5.07M [00:00<00:00, 11.7MB/s]
tokenizer.json: 100% 9.10M/9.10M [00:00<00:00, 14.2MB/s]

If config.json is not possible then where are these files coming from? Or is this for something else?

davidgxue avatar Mar 26 '24 15:03 davidgxue

Hi @davidgxue

I think that

This is not possible for flair, as flair builds up on several kinds of embeddings - possibly combining multiple - and adding additional layers.

already answers your question. Flair models can use those Hugging Face models as embeddings, but they compose more than just that. The files you see being downloaded (config.json, tokenizer, sentencepiece model) belong to the underlying transformer embedding, not to the Flair model itself.
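One way to see this for yourself (a sketch assuming the usual attribute layout of a transformer-based tagger):

```python
# Sketch: the downloaded config.json belongs to the transformer embedding nested
# inside the Flair tagger, not to the tagger itself. Attribute layout assumed.
from flair.models import SequenceTagger

tagger = SequenceTagger.load("flair/ner-english-large")
print(type(tagger))                    # flair.models.SequenceTagger
print(type(tagger.embeddings))         # e.g. TransformerWordEmbeddings
print(tagger.embeddings.model.config)  # the huggingface config that config.json describes
```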

helpmefindaname avatar Mar 29 '24 12:03 helpmefindaname

Uh oh. I was trying to load these high-performance Flair models into Bumblebee. So it seems that is impossible.

SichangHe avatar Jul 07 '24 14:07 SichangHe

@SichangHe looking at the Bumblebee docs, I don't see anything suggesting that they support Flair models.

I suppose you could open a feature request on their side, but I am not sure if they'd want to.

helpmefindaname avatar Jul 19 '24 11:07 helpmefindaname