
How to configure GPT4ALL?

voarsh2 opened this issue 2 years ago · 12 comments

I set up the following env vars (as per your docs):

  GEN_AI_MODEL_PROVIDER: gpt4all
  GEN_AI_MODEL_VERSION: ggml-model-gpt4all-falcon-q4_0.bin
  INTERNAL_MODEL_VERSION: gpt4all-chat-completion

But the output from the API server gives me:


2023-11-07T08:56:11.233517463Z   File "/app/danswer/direct_qa/llm_utils.py", line 84, in get_default_qa_model
2023-11-07T08:56:11.233547959Z     llm = get_default_llm(timeout=timeout)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/danswer/llm/build.py", line 48, in get_default_llm
    raise ValueError(f"Unknown LLM model: {INTERNAL_MODEL_VERSION}")
2023-11-07T08:56:11.233950634Z ValueError: Unknown LLM model: gpt4all-chat-completion

It does in fact download the model, and I rebuilt the image with the gpt4all requirement added back to the Dockerfile. Am I missing something?

Do I need to add these lines back in? https://github.com/danswer-ai/danswer/pull/233/files#diff-cd54b37b2e4fe499ffd1a5acb2bb4545ab7c28cfd9c834b6c5a9483a528a93f4

I also get the following when it tries to use the GPT4ALL model:

Segmentation fault (core dumped)

voarsh2 avatar Nov 07 '23 08:11 voarsh2

https://docs.danswer.dev/gen_ai_configs/gpt_4_all

Where are you seeing INTERNAL_MODEL_VERSION? I think you may be referring to outdated docs. Please share where it is so I can correct it.

Also, yeah, the issue where GPT4ALL isn't supported on all platforms is sadly still around, so you'll have to add the requirement back and build the image yourself: https://github.com/danswer-ai/danswer/blob/main/backend/requirements/default.txt#L18.

https://docs.danswer.dev/quickstart (see build from source)

yuhongsun96 avatar Nov 07 '23 23:11 yuhongsun96

https://docs.danswer.dev/gen_ai_configs/gpt_4_all Where are you seeing INTERNAL_MODEL_VERSION? I think you may be referring to outdated docs. Please share where it is so I can correct it.

It complains about INTERNAL_MODEL_VERSION being blank - this is on my Kubernetes global-env branch where the values are blank. Ordinarily openai-chat-completion would be the value of INTERNAL_MODEL_VERSION. I'll also try to remember where I saw it in the outdated docs.

If I remove the env var (INTERNAL_MODEL_VERSION) from the deployment:

Using Internal Model: openai-chat-completion
main.py 159 : Actual LLM model version: mistral-7b-openorca.Q4_0.gguf
users.py  66 : Using Auth Type: google_oauth
main.py 170 : Both OAuth Client ID and Secret are configured.
main.py 175 : Using Embedding model: "thenlper/gte-small"
main.py 180 : Warming up local NLP models.

It doesn't download the model mistral-7b-openorca.Q4_0.gguf - it does not exist on disk. It did download the other model (ggml-model-gpt4all-falcon-q4_0.bin) by itself earlier.
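
For context, one way to check which model files the gpt4all library will fetch on its own is its download catalogue. A minimal sketch (assuming the gpt4all Python package is installed; filenames outside the catalogue have to be placed on disk manually):

  from gpt4all import GPT4All

  # list_models() returns gpt4all's download catalogue; files not listed here
  # are never fetched automatically.
  available = {entry["filename"] for entry in GPT4All.list_models()}
  print("mistral-7b-openorca.Q4_0.gguf" in available)
  print("ggml-model-gpt4all-falcon-q4_0.bin" in available)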

Also, yeah, the issue where GPT4ALL isn't supported on all platforms is sadly still around, so you'll have to add the requirement back and build the image yourself: https://github.com/danswer-ai/danswer/blob/main/backend/requirements/default.txt#L18.

As stated, I added back the requirement line for gpt4all.

https://docs.danswer.dev/quickstart (see build from source)

voarsh2 avatar Nov 08 '23 00:11 voarsh2

Be sure you're on the latest version of the code as well. If you're still seeing INTERNAL_MODEL_VERSION, then you must be on a pretty old version

yuhongsun96 avatar Nov 08 '23 01:11 yuhongsun96

Be sure you're on the latest version of the code as well. If you're still seeing INTERNAL_MODEL_VERSION, then you must be on a pretty old version

Yes, the branch is reasonably up to date (within 1 week). Is INTERNAL_MODEL_VERSION even used?

voarsh2 avatar Nov 08 '23 01:11 voarsh2

It is no longer used - that's why I was wondering if you were on an old version. Is the new docs/environment setup for GPT4All not working for you?

yuhongsun96 avatar Nov 08 '23 01:11 yuhongsun96

It is no longer used - that's why I was wondering if you were on an old version. Is the new docs/environment setup for GPT4All not working for you?

I just merged 0.2.65 into my branch. So you're saying INTERNAL_MODEL_VERSION is not used in v0.2.65, and I can remove this reference in https://github.com/danswer-ai/danswer/pull/515?

I've downloaded the model manually in /root/.cache/gpt4all (mistral-7b-openorca.Q4_0.gguf)
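
(Manually fetching it is roughly equivalent to this gpt4all snippet - just a sketch, with the cache path set to where the api_server looks in my setup and allow_download left at the library default:)

  from gpt4all import GPT4All

  # Downloads mistral-7b-openorca.Q4_0.gguf into /root/.cache/gpt4all if it
  # is not already there, then runs a tiny smoke-test generation.
  model = GPT4All(
      model_name="mistral-7b-openorca.Q4_0.gguf",
      model_path="/root/.cache/gpt4all",
      allow_download=True,
  )
  print(model.generate("Say hello", max_tokens=16))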

voarsh2 avatar Nov 08 '23 01:11 voarsh2

Yeah, INTERNAL_MODEL_VERSION is gone. You can refer to the dev docker compose file for the most useful env variables.

yuhongsun96 avatar Nov 08 '23 02:11 yuhongsun96

Perfect. Waiting for my build to rebuild all the Docker images, and I will test again against the 0.2.65 tag using just:

GEN_AI_MODEL_PROVIDER=gpt4all
GEN_AI_MODEL_VERSION=mistral-7b-openorca.Q4_0.gguf

Will also update my PR to remove the unused env var.

voarsh2 avatar Nov 08 '23 02:11 voarsh2

I gave it another try. I used the options you suggested:

GEN_AI_MODEL_PROVIDER=gpt4all
GEN_AI_MODEL_VERSION=mistral-7b-openorca.Q4_0.gguf

but the API server now throws this (start of the traceback omitted):

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/danswer/secondary_llm_flows/query_validation.py", line 63, in stream_query_answerability
    for token in tokens:
  File "/app/danswer/llm/chat_llm.py", line 66, in stream
    for token in message_generator_to_string_generator(self.llm.stream(prompt)):
  File "/app/danswer/llm/utils.py", line 139, in message_generator_to_string_generator
    for message in messages:
  File "/usr/local/lib/python3.11/site-packages/langchain/chat_models/base.py", line 220, in stream
    raise e
  File "/usr/local/lib/python3.11/site-packages/langchain/chat_models/base.py", line 209, in stream
    for chunk in self._stream(
  File "/usr/local/lib/python3.11/site-packages/langchain/chat_models/litellm.py", line 350, in _stream
    for chunk in self.completion_with_retry(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/chat_models/litellm.py", line 240, in completion_with_retry
    return _completion_with_retry(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 325, in iter
    raise retry_exc.reraise()
          ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 158, in reraise
    raise self.last_attempt.result()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/tenacity/__init__.py", line 382, in __call__
    result = fn(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain/chat_models/litellm.py", line 238, in _completion_with_retry
    return self.client.completion(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/litellm/utils.py", line 830, in wrapper
    raise e
  File "/usr/local/lib/python3.11/site-packages/litellm/utils.py", line 789, in wrapper
    result = original_function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/litellm/timeout.py", line 53, in wrapper
    result = future.result(timeout=local_timeout_duration)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.11/site-packages/litellm/timeout.py", line 42, in async_func
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/litellm/main.py", line 1266, in completion
    raise exception_type(
          ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/litellm/utils.py", line 3338, in exception_type
    raise e
  File "/usr/local/lib/python3.11/site-packages/litellm/utils.py", line 3320, in exception_type
    raise APIError(status_code=500, message=str(original_exception), llm_provider=custom_llm_provider, model=model)
litellm.exceptions.APIError: LLM Provider NOT provided. Pass in the LLM provider you are trying to call. E.g. For 'Huggingface' inference endpoints pass in `completion(model='huggingface/gpt4all/mistral-7b-openorca.Q4_0.gguf',..)` Learn more: https://docs.litellm.ai/docs/providers

Provider List: https://docs.litellm.ai/docs/providers
Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm.set_verbose=True`.

Something missing in the docs?

voarsh2 avatar Nov 08 '23 19:11 voarsh2

Any chance you have time to tell me how to get the local in-memory LLM working?

voarsh2 avatar Nov 17 '23 23:11 voarsh2

Ah, I found a bug and fixed it - please pull main and try again! Don't forget you have to install gpt4all in the container or your environment, as we don't install the library by default due to issues with certain architectures.

yuhongsun96 avatar Nov 19 '23 05:11 yuhongsun96

That issue has gone away, it seems, but I can't test further because mistral-7b-openorca.Q4_0.gguf doesn't seem to work on AVX1-only CPUs:

double free or corruption (!prev)
2023-11-29T23:32:50.100321020Z Aborted (core dumped)

I tried on an AVX2-capable CPU, but I don't have the memory on that node for a 7B model - it isn't a node dedicated just to Danswer. :/

voarsh2 avatar Nov 29 '23 23:11 voarsh2

You can probably try a smaller model? Also, a lot of people have been liking Ollama - maybe that will work better. You can run Ollama somewhere and point Danswer to it.
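
For reference, Danswer reaches hosted providers through litellm (the library in the traceback above), and Ollama is one of the providers litellm supports out of the box. A minimal litellm-level sketch - the host, port, and model name here are assumptions, and this is not Danswer's actual call site:

  import litellm

  # Assumes an Ollama server is already running at this address with the
  # "mistral" model pulled; litellm infers the provider from the "ollama/" prefix.
  response = litellm.completion(
      model="ollama/mistral",
      messages=[{"role": "user", "content": "Say hello"}],
      api_base="http://localhost:11434",
  )
  print(response["choices"][0]["message"]["content"])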

I'm going to close the issue since it seems the problem is no longer with the Danswer portion of things. Best of luck!

yuhongsun96 avatar Jan 08 '24 05:01 yuhongsun96