llama-cpp-python
Changing parameters after Llama is initialized leads to inconsistent behaviour
Hi people 👋🏾 !
While using langchain with llama-cpp-python, I noticed that I had to initialise two instances of the model: one for the embeddings and another for inference. I wanted to change that (see this issue: https://github.com/hwchase17/langchain/issues/2630) by passing the same llama "client" when initialising both LlamaCpp and LlamaCppEmbeddings, and toggling the embedding parameter to true/false as needed. However, I've noticed that the library does not currently support this:
Reproducible code
```python
from llama_cpp import Llama

llama = Llama(model_path=MODEL_PATH, verbose=False, embedding=False)
response = llama(prompt="Brasil is awesome")  # it works

llama.embedding = True
embeddings = llama.embed("Spain is awesome")  # it does not work
```
which throws this error:

```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/adria.cabezasantanna/personal_projects/automatic-documentation/venv/lib/python3.9/site-packages/llama_cpp/llama.py", line 315, in embed
    return list(map(float, self.create_embedding(input)["data"][0]["embedding"]))
  File "/Users/adria.cabezasantanna/personal_projects/automatic-documentation/venv/lib/python3.9/site-packages/llama_cpp/llama.py", line 272, in create_embedding
    raise RuntimeError(
RuntimeError: Llama model must be created with embedding=True to call this method
```
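My understanding (hedged, from reading the traceback rather than the library internals) is that the embedding flag is consumed once at construction time, when the native llama.cpp context is created, so assigning to the Python attribute afterwards never reaches the underlying context. A minimal, self-contained sketch of that pattern, with invented names (`FakeLlama`, `FakeNativeContext`) standing in for the real wrapper and the C-level context:

```python
# Hypothetical, simplified model of the pattern; the real library wraps a
# native llama.cpp context whose options are fixed when it is created.
class FakeNativeContext:
    """Stand-in for the C-level context, configured once at creation."""
    def __init__(self, embedding: bool):
        self._embedding = embedding  # the flag is baked in here, once

    def embed(self, text: str) -> list:
        if not self._embedding:
            raise RuntimeError(
                "Llama model must be created with embedding=True to call this method"
            )
        return [0.0, 0.0, 0.0]  # placeholder vector


class FakeLlama:
    def __init__(self, embedding: bool = False):
        self.embedding = embedding  # Python-side attribute only
        self._ctx = FakeNativeContext(embedding)  # flag consumed here

    def embed(self, text: str) -> list:
        # The guard lives in the already-created context, so mutating
        # `self.embedding` after __init__ has no effect on this call.
        return self._ctx.embed(text)


llama = FakeLlama(embedding=False)
llama.embedding = True  # never reaches the underlying context
try:
    llama.embed("Spain is awesome")
except RuntimeError as err:
    print(err)  # same error as in the traceback above
```

This reproduces the behaviour above: the attribute assignment succeeds silently, but the check that raises is evaluated against state captured at construction.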
Interestingly, if you initialise it with embedding=True, you can do both: perform inference and get the embeddings:
```python
from llama_cpp import Llama

llama = Llama(model_path=model_path, verbose=False, embedding=True)
embeddings = llama.embed("Spain is awesome")  # it works
response = llama(prompt="Brasil is awesome")  # it also works
```
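Until the flag can be toggled, one pragmatic workaround is to create the client once with embedding=True and hand the same instance to every consumer. A minimal sketch; `make_client_factory` is a hypothetical helper name, and the constructor is passed in as a parameter so the example is not tied to the real Llama signature:

```python
from functools import lru_cache


def make_client_factory(ctor):
    """Return a factory that builds at most one client per model path,
    always with embedding=True so the single instance can serve both
    inference and embedding callers."""
    @lru_cache(maxsize=None)
    def get_client(model_path: str):
        # embedding=True up front, since the flag cannot be changed later
        return ctor(model_path=model_path, embedding=True)
    return get_client
```

With the real library this would be `make_client_factory(Llama)`, and both LlamaCpp and LlamaCppEmbeddings could then be fed the same cached instance, which is essentially what the langchain issue above asks to support.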
Is there any reason for not always setting the embedding parameter to true? And why can't we change the parameter dynamically?