
Is KeyBERT going to support LLaMA?

Open: thtang opened this issue 2 years ago • 1 comment

Hi, I get an error when I change the model to decapoda-research/llama-7b-hf. Does this error come from sentence-transformers?

ValueError: Asking to pad but the tokenizer does not have a padding token. Please select a token to use as pad_token (tokenizer.pad_token = tokenizer.eos_token e.g.) or add a new pad token via tokenizer.add_special_tokens({'pad_token': '[PAD]'}).

thtang avatar May 14 '23 06:05 thtang

Thanks for sharing. The model you pass to KeyBERT is meant for creating embeddings, not for performing the keyword search itself. It should be possible to integrate LLaMA within KeyBERT, but since its procedure is quite different from how KeyBERT works, many parameters would have no effect, such as use_mmr, use_maxsum, vectorizer, doc_embeddings, word_embeddings, etc.

MaartenGr avatar May 15 '23 11:05 MaartenGr
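
The ValueError quoted in the question suggests its own workaround: assign a padding token before any batch is padded. Below is a minimal sketch of that idea, not a confirmed fix for this issue. The helper names ensure_pad_token and build_keybert are hypothetical, and the sketch assumes the checkpoint loads through sentence-transformers' models.Transformer and that mean-pooled hidden states are acceptable as embeddings, neither of which is verified here for LLaMA.

```python
from types import SimpleNamespace


def ensure_pad_token(tokenizer):
    """Give the tokenizer a padding token if it lacks one.

    Reuses the end-of-sequence token, as the error message suggests.
    """
    if getattr(tokenizer, "pad_token", None) is None:
        tokenizer.pad_token = tokenizer.eos_token
    return tokenizer


def build_keybert(model_name):
    """Hypothetical helper: wrap a Hugging Face model as a KeyBERT backend."""
    # Imported here so ensure_pad_token stays usable without these packages.
    from keybert import KeyBERT
    from sentence_transformers import SentenceTransformer, models

    # Load the transformer and patch its tokenizer before any padding happens.
    word_embedding_model = models.Transformer(model_name)
    ensure_pad_token(word_embedding_model.tokenizer)

    # Pooling defaults to averaging the token embeddings (mean pooling).
    pooling_model = models.Pooling(
        word_embedding_model.get_word_embedding_dimension()
    )
    embedder = SentenceTransformer(modules=[word_embedding_model, pooling_model])
    return KeyBERT(model=embedder)


# Stand-in tokenizer (an assumption for illustration) with no pad token set.
demo = SimpleNamespace(pad_token=None, eos_token="</s>")
ensure_pad_token(demo)
print(demo.pad_token)  # prints "</s>"
```

With the packages installed and enough memory for the checkpoint, build_keybert("decapoda-research/llama-7b-hf").extract_keywords(doc) would be the intended call. Whether the resulting embeddings give useful keywords is a separate question.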