llama-cpp-python
Changing parameters after Llama is initialized leads to inconsistent behaviour
Hi people 👋🏾 !
While using langchain with llama-cpp-python, I noticed that I had to initialise two instances of the model: one for the embeddings and another for inference. I wanted to change that (see this issue: https://github.com/hwchase17/langchain/issues/2630) by passing the same llama "client" when initialising both LlamaCpp and LlamaCppEmbeddings, and toggling the embedding parameter to true/false as needed. However, I've noticed that the library does not currently support this:
Reproducible code
```python
from llama_cpp import Llama

llama = Llama(model_path=MODEL_PATH, verbose=False, embedding=False)
response = llama(prompt="Brasil is awesome")  # it works

llama.embedding = True
embeddings = llama.embed("Spain is awesome")  # it does not work
```
which throws this error:

```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/adria.cabezasantanna/personal_projects/automatic-documentation/venv/lib/python3.9/site-packages/llama_cpp/llama.py", line 315, in embed
    return list(map(float, self.create_embedding(input)["data"][0]["embedding"]))
  File "/Users/adria.cabezasantanna/personal_projects/automatic-documentation/venv/lib/python3.9/site-packages/llama_cpp/llama.py", line 272, in create_embedding
    raise RuntimeError(
RuntimeError: Llama model must be created with embedding=True to call this method
```
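My understanding (hedged, from reading the traceback rather than the library internals) is that the embedding flag is consumed once at construction time, when the native llama.cpp context is created, so assigning to the Python attribute afterwards never reaches the underlying context. A minimal, self-contained sketch of that pattern, with invented names (`FakeLlama`, `FakeNativeContext`) standing in for the real wrapper and the C-level context:

```python
# Hypothetical, simplified model of the pattern; the real library wraps a
# native llama.cpp context whose options are fixed when it is created.
class FakeNativeContext:
    """Stand-in for the C-level context, configured once at creation."""
    def __init__(self, embedding: bool):
        self._embedding = embedding  # the flag is baked in here, once

    def embed(self, text: str) -> list:
        if not self._embedding:
            raise RuntimeError(
                "Llama model must be created with embedding=True to call this method"
            )
        return [0.0, 0.0, 0.0]  # placeholder vector


class FakeLlama:
    def __init__(self, embedding: bool = False):
        self.embedding = embedding  # Python-side attribute only
        self._ctx = FakeNativeContext(embedding)  # flag consumed here

    def embed(self, text: str) -> list:
        # The guard lives in the already-created context, so mutating
        # `self.embedding` after __init__ has no effect on this call.
        return self._ctx.embed(text)


llama = FakeLlama(embedding=False)
llama.embedding = True  # never reaches the underlying context
try:
    llama.embed("Spain is awesome")
except RuntimeError as err:
    print(err)  # same error as in the traceback above
```

This reproduces the behaviour above: the attribute assignment succeeds silently, but the check that raises is evaluated against state captured at construction.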
Interestingly, if you initialise it with embedding=True, you can do both: perform inference and get the embeddings:
```python
from llama_cpp import Llama

llama = Llama(model_path=model_path, verbose=False, embedding=True)
embeddings = llama.embed("Spain is awesome")  # it works
response = llama(prompt="Brasil is awesome")  # it also works
```
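Until the flag can be toggled, one pragmatic workaround is to create the client once with embedding=True and hand the same instance to every consumer. A minimal sketch; `make_client_factory` is a hypothetical helper name, and the constructor is passed in as a parameter so the example is not tied to the real Llama signature:

```python
from functools import lru_cache


def make_client_factory(ctor):
    """Return a factory that builds at most one client per model path,
    always with embedding=True so the single instance can serve both
    inference and embedding callers."""
    @lru_cache(maxsize=None)
    def get_client(model_path: str):
        # embedding=True up front, since the flag cannot be changed later
        return ctor(model_path=model_path, embedding=True)
    return get_client
```

With the real library this would be `make_client_factory(Llama)`, and both LlamaCpp and LlamaCppEmbeddings could then be fed the same cached instance, which is essentially what the langchain issue above asks to support.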
Is there any reason for not always setting the embedding parameter to true? And why can't we change the parameter dynamically?