How to configure GPT4ALL?
I set up the following env vars (as per your docs):
GEN_AI_MODEL_PROVIDER: gpt4all
GEN_AI_MODEL_VERSION: ggml-model-gpt4all-falcon-q4_0.bin
INTERNAL_MODEL_VERSION: gpt4all-chat-completion
But the output from the API server gives me:
2023-11-07T08:56:11.233517463Z File "/app/danswer/direct_qa/llm_utils.py", line 84, in get_default_qa_model
2023-11-07T08:56:11.233547959Z llm = get_default_llm(timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/danswer/llm/build.py", line 48, in get_default_llm
raise ValueError(f"Unknown LLM model: {INTERNAL_MODEL_VERSION}")
2023-11-07T08:56:11.233950634Z ValueError: Unknown LLM model: gpt4all-chat-completion
It does in fact download the model, and I rebuilt the image from the Dockerfile with the gpt4all requirement added back. Am I missing something?
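From that traceback it looks like get_default_llm just dispatches on INTERNAL_MODEL_VERSION and falls through to the ValueError when the value isn't recognised; roughly something like this, going only by the error message (not the actual source, the recognised values here are a guess):

```python
import os

# My rough reading of the dispatch in build.py, based only on the error
# message above -- not the real Danswer code. The point is just that
# "gpt4all-chat-completion" isn't one of the keys it knows about.
KNOWN_MODELS = {"openai-chat-completion"}  # guess at the recognised values

def get_default_llm(timeout: int = 60):
    internal_model_version = os.environ.get("INTERNAL_MODEL_VERSION", "")
    if internal_model_version not in KNOWN_MODELS:
        raise ValueError(f"Unknown LLM model: {internal_model_version}")
    ...  # build and return the matching LLM client here
```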
Do I need to add these lines back in? https://github.com/danswer-ai/danswer/pull/233/files#diff-cd54b37b2e4fe499ffd1a5acb2bb4545ab7c28cfd9c834b6c5a9483a528a93f4
I also get the following when it tries to use the GPT4All model:
Segmentation fault (core dumped)
https://docs.danswer.dev/gen_ai_configs/gpt_4_all
Where are you seeing INTERNAL_MODEL_VERSION? I think you may be referring to outdated docs. Please share where it is so I can correct it.
Also, ya the issue where GPT4ALL isn't supported on all platforms is sadly still around. So you'll have to add back the requirement and build the image https://github.com/danswer-ai/danswer/blob/main/backend/requirements/default.txt#L18.
https://docs.danswer.dev/quickstart (see build from source)
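Once you've added the gpt4all line back and rebuilt, you can sanity-check it inside the api_server container with something like this (rough sketch; point it at whatever model GEN_AI_MODEL_VERSION is set to):

```python
# Quick check that the optional gpt4all dependency is installed and can
# actually load a model on this CPU. If the file isn't already in
# ~/.cache/gpt4all, the library downloads it first.
from gpt4all import GPT4All

model = GPT4All("ggml-model-gpt4all-falcon-q4_0.bin")
print(model.generate("Say hello"))
```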
> https://docs.danswer.dev/gen_ai_configs/gpt_4_all
> Where are you seeing INTERNAL_MODEL_VERSION? I think you may be referring to outdated docs. Please share where it is so I can correct it.
It complains about INTERNAL_MODEL_VERSION being blank. This is on my Kubernetes global-env branch, where the values are blank; ordinarily INTERNAL_MODEL_VERSION would be set to openai-chat-completion.
I'll also try to remember where I saw the outdated docs.
If I remove the env var (INTERNAL_MODEL_VERSION) from the deployment:
Using Internal Model: openai-chat-completion
main.py 159 : Actual LLM model version: mistral-7b-openorca.Q4_0.gguf
users.py 66 : Using Auth Type: google_oauth
main.py 170 : Both OAuth Client ID and Secret are configured.
main.py 175 : Using Embedding model: "thenlper/gte-small"
main.py 180 : Warming up local NLP models.
It doesn't download the model `mistral-7b-openorca.Q4_0.gguf`, so the file does not exist.
It downloaded the other model (ggml-model-gpt4all-falcon-q4_0.bin) by itself.
> Also, ya the issue where GPT4ALL isn't supported on all platforms is sadly still around. So you'll have to add back the requirement and build the image https://github.com/danswer-ai/danswer/blob/main/backend/requirements/default.txt#L18.
As stated, I added back the requirement line for gpt4all.
> https://docs.danswer.dev/quickstart (see build from source)
Be sure you're on the latest version of the code as well. If you're still seeing INTERNAL_MODEL_VERSION, then you must be on a pretty old version
Yes, the branch is reasonably up-to-date (within 1 week)
Is INTERNAL_MODEL_VERSION even used?
It is no longer used, that's why I was wondering if you were on an old version. Is the new docs/environment settings for setting up GPT4All not working for you?
I just merged 0.2.65 into my branch. So you're saying INTERNAL_MODEL_VERSION is not used in v0.2.65, and I can remove this reference in https://github.com/danswer-ai/danswer/pull/515?
I've downloaded the model manually in /root/.cache/gpt4all (mistral-7b-openorca.Q4_0.gguf)
Ya, INTERNAL_MODEL_VERSION is gone. You can refer to the dev docker compose file for the most useful env variables.
Perfect. Waiting for my build to finish building all the Docker images, and I'll test again against the 0.2.65 tag using just:
GEN_AI_MODEL_PROVIDER=gpt4all
GEN_AI_MODEL_VERSION=mistral-7b-openorca.Q4_0.gguf
Will also update my PR to remove the unused env var.
I gave it another try. I used the options you suggested:
GEN_AI_MODEL_PROVIDER=gpt4all
GEN_AI_MODEL_VERSION=mistral-7b-openorca.Q4_0.gguf
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/app/danswer/secondary_llm_flows/query_validation.py", line 63, in stream_query_answerability
for token in tokens:
File "/app/danswer/llm/chat_llm.py", line 66, in stream
for token in message_generator_to_string_generator(self.llm.stream(prompt)):
File "/app/danswer/llm/utils.py", line 139, in message_generator_to_string_generator
for message in messages:
File "/usr/local/lib/python3.11/site-packages/langchain/chat_models/base.py", line 220, in stream
raise e
File "/usr/local/lib/python3.11/site-packages/langchain/chat_models/base.py", line 209, in stream
for chunk in self._stream(
File "/usr/local/lib/python3.11/site-packages/langchain/chat_models/litellm.py", line 350, in _stream
for chunk in self.completion_with_retry(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/langchain/chat_models/litellm.py", line 240, in completion_with_retry
return _completion_with_retry(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 289, in wrapped_f
return self(f, *args, **kw)
^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 379, in __call__
do = self.iter(retry_state=retry_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 325, in iter
raise retry_exc.reraise()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 158, in reraise
raise self.last_attempt.result()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 382, in __call__
result = fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/langchain/chat_models/litellm.py", line 238, in _completion_with_retry
return self.client.completion(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/litellm/utils.py", line 830, in wrapper
raise e
File "/usr/local/lib/python3.11/site-packages/litellm/utils.py", line 789, in wrapper
result = original_function(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/litellm/timeout.py", line 53, in wrapper
result = future.result(timeout=local_timeout_duration)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/usr/local/lib/python3.11/site-packages/litellm/timeout.py", line 42, in async_func
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/litellm/main.py", line 1266, in completion
raise exception_type(
^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/litellm/utils.py", line 3338, in exception_type
raise e
File "/usr/local/lib/python3.11/site-packages/litellm/utils.py", line 3320, in exception_type
raise APIError(status_code=500, message=str(original_exception), llm_provider=custom_llm_provider, model=model)
litellm.exceptions.APIError: LLM Provider NOT provided. Pass in the LLM provider you are trying to call. E.g. For 'Huggingface' inference endpoints pass in `completion(model='huggingface/gpt4all/mistral-7b-openorca.Q4_0.gguf',..)` Learn more: https://docs.litellm.ai/docs/providers
Provider List: https://docs.litellm.ai/docs/providers
Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm.set_verbose=True`.
Something missing in the docs?
Do you have time to tell me how to get the local in-memory LLM working?
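For reference, the litellm error seems to be about the provider prefix on the model string: litellm wants something like provider/model before it knows how to route the call. A minimal illustration of that pattern (not what Danswer does internally; Ollama is just used here as an example provider):

```python
# Illustration of litellm's provider-prefixed model strings, per the error
# message above -- not Danswer code. Assumes an Ollama server on the
# default port with a pulled "mistral" model.
import litellm

response = litellm.completion(
    model="ollama/mistral",
    messages=[{"role": "user", "content": "Say hello"}],
    api_base="http://localhost:11434",
)
print(response)
```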
Ah, I found a bug and fixed it; please pull main and try again! Don't forget you have to install gpt4all in the container or your environment, as we don't install the library by default due to issues with certain architectures.
That issue has gone away, it seems, but I can't test further because mistral-7b-openorca.Q4_0.gguf doesn't seem to work on AVX1 CPUs:
double free or corruption (!prev)
2023-11-29T23:32:50.100321020Z Aborted (core dumped)
I tried on an AVX2-capable CPU, but I don't have enough memory on that node for a 7B model; it isn't a node dedicated just to Danswer. :/
You can probably try a smaller model? Also, a lot of people have been liking Ollama; maybe that will work better. You can run Ollama somewhere and point Danswer to it.
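If you try Ollama, a quick way to confirm the server itself is responding before pointing Danswer at it (rough sketch; assumes the default port and that you've already pulled a model called mistral):

```python
# Minimal check against Ollama's REST API. Assumes `ollama pull mistral`
# has been run and the server is listening on its default port 11434.
import json
import urllib.request

payload = json.dumps(
    {"model": "mistral", "prompt": "Say hello", "stream": False}
).encode()
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```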
I'm going to close the issue since it seems the problem is no longer with the Danswer portion of things. Best of luck!