GLiNER
How to use a custom tokenizer after training the model?
I hope you're doing well. I have a question regarding the usage of a custom tokenizer after training a model using your library.
I have trained a model using the GLiNER framework, but now I want to use a custom tokenizer, specifically a tokenizer for the Thai language. I noticed issues when the tokenizer's vocabulary size doesn't match the model's expected vocabulary size, and I'm trying to figure out the best way to resolve this.
Could you please guide me on the following:
1. How do I load and use a custom tokenizer with a trained model?
2. Is there a recommended way to update or resize the token embeddings if additional tokens are added to the tokenizer?

I would appreciate any advice on how to ensure the tokenizer and model remain compatible after training.
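For context on question 2: GLiNER models wrap a Hugging Face transformer backbone, and for standard `transformers` models the usual route is `model.resize_token_embeddings(len(tokenizer))` after calling `tokenizer.add_tokens(...)`. The exact attribute path to the backbone inside a GLiNER model may differ, so as a hedge, here is a minimal plain-PyTorch sketch of what that resize operation does under the hood: grow the embedding matrix to the new vocabulary size while preserving the rows learned during training. The function name and sizes are illustrative, not part of the GLiNER API.

```python
import torch
import torch.nn as nn

def resize_token_embeddings(old: nn.Embedding, new_vocab_size: int) -> nn.Embedding:
    """Return a new embedding matrix sized for the new tokenizer vocabulary,
    copying over the rows learned during training (illustrative helper,
    not a GLiNER API)."""
    new = nn.Embedding(new_vocab_size, old.embedding_dim)
    n_copy = min(old.num_embeddings, new_vocab_size)
    with torch.no_grad():
        # Preserve trained rows; rows beyond n_copy keep their random init.
        new.weight[:n_copy] = old.weight[:n_copy]
    return new

# Example: grow a 100-token embedding to 120 tokens.
emb = nn.Embedding(100, 32)
resized = resize_token_embeddings(emb, 120)
print(resized.weight.shape)  # torch.Size([120, 32])
```

Note that the newly added rows are randomly initialized, so after resizing you would typically fine-tune the model on Thai data so those embeddings (and the rest of the model) adapt to the new tokens.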