
Switch from pyllamacpp to the nomic-ai/pygpt4all bindings for gpt4all

npow opened this pull request 1 year ago • 5 comments

Switch from pyllamacpp to the nomic-ai/pygpt4all bindings for gpt4all.

Fixes #3725 and https://github.com/nomic-ai/gpt4all/issues/468

npow avatar Apr 30 '23 16:04 npow

Thank you for the fix @npow! Would it be helpful to add an import integration test, so that we can catch this early next time? Do you mind updating the Jupyter notebook example too? Thanks so much!

notebook link: https://github.com/hwchase17/langchain/blob/master/docs/modules/models/llms/integrations/gpt4all.ipynb
integration test: https://github.com/hwchase17/langchain/blob/master/tests/integration_tests/llms/test_gpt4all.py#LL9C1-L9C1
(It doesn't look like we have an import test. It's also very interesting that the test_gpt4all_inference() test didn't trigger the error. Hmm.)

Please lmk if you have time for the two suggested updates. If not, I'm happy to do it for you. :) Thank you for your contribution!

skcoirz avatar Apr 30 '23 17:04 skcoirz

Whoops, my bad -- I forgot to check the tests. I think the test passes because it's using an older model; the new models from nomic-ai/pygpt4all have their own bindings.

npow avatar Apr 30 '23 19:04 npow

Updated the docs and the test. Not sure what you mean by "import test" -- should we change the model_url to be a newer one?

npow avatar Apr 30 '23 19:04 npow

> Updated the docs and the test. Not sure what you mean by "import test" -- should we change the model_url to be a newer one?

As mentioned in the original request, this Python binding is no longer compatible with pyllamacpp; instead, the new repo pygpt4all should be used. I'm wondering if there is any warning or error that could tell us the old combination no longer works? That would be a good validation check against similar conflicts in the future. Sorry about the confusion -- by "import check" I mean dependency / compatibility validation.

skcoirz avatar Apr 30 '23 20:04 skcoirz

Ok, I see what you mean. We could try catching the `TypeError` exception around `self.client.generate()` at https://github.com/hwchase17/langchain/blob/master/langchain/llms/gpt4all.py#L186, but that's not a very robust solution. WDYT?

For reference, the original error is: `TypeError: Model.generate() got an unexpected keyword argument 'new_text_callback'`

npow avatar Apr 30 '23 21:04 npow
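For illustration, the try/catch idea floated above could look roughly like the following. This is a hypothetical sketch, not the actual langchain code; `generate_with_guard` and the error message are invented names, and `client` stands in for whatever model object the bindings return.

```python
# Hypothetical sketch: wrap the bindings call so that an incompatible
# set of gpt4all bindings surfaces a clear, actionable error instead of
# an opaque TypeError about an unexpected keyword argument.

def generate_with_guard(client, prompt, text_callback):
    try:
        return client.generate(prompt, new_text_callback=text_callback)
    except TypeError as e:
        # Incompatible bindings reject unknown keyword arguments, e.g.
        # "Model.generate() got an unexpected keyword argument 'new_text_callback'".
        raise ValueError(
            "Your installed gpt4all bindings appear incompatible with this "
            "integration; please install the nomic-ai/pygpt4all bindings."
        ) from e
```

As noted in the thread, this only converts the failure into a friendlier message after the fact; it does not prevent the mismatch.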

> Ok, I see what you mean. We could try catching the `TypeError` exception around `self.client.generate()` at https://github.com/hwchase17/langchain/blob/master/langchain/llms/gpt4all.py#L186, but that's not a very robust solution. WDYT?
>
> For reference, the original error is: `TypeError: Model.generate() got an unexpected keyword argument 'new_text_callback'`

I see, agreed. Adding a try/catch is tricky since this is a function<>input_key conflict. A more ideal way is to do a type check in advance. Instead of Any, we could make the client's type more explicit. Not sure if that's supported by the GPT4All library; will need to check. (code pointer: https://github.com/hwchase17/langchain/blob/master/langchain/llms/gpt4all.py#L89)

Another general question is how to improve the user experience: should we catch all exceptions and provide more concrete, enriched instructions, or just let them raise and rely on users to fix things themselves? It seems there is a balance. For common issues, we can catch them and give the user enriched feedback. Anyway, let me think about how to improve the experience here. Thank you so much for the feedback and collaboration!

skcoirz avatar May 01 '23 19:05 skcoirz

@npow this does seem to solve #3839 as per my testing; however, the notebook you've (maybe you, not sure) updated here seems a bit jumbled in its order of operations. Is that something I should bring up here or in a separate issue? Apologies if this is an odd request -- this is my first foray into open source.

DarpanGanatra avatar May 01 '23 19:05 DarpanGanatra

@DarpanGanatra I don't think that's me, because I didn't change any of the outputs of the notebook -- that link also doesn't have my change.

npow avatar May 02 '23 01:05 npow

> @npow this does seem to solve #3839 as per my testing, however the notebook you've (maybe you, not sure) updated here seems a bit jumbled on the order of operations. Is that something I should be bringing up here or in a separate issue? Apologies if this is an odd request, this is my first foray into open source

Could you elaborate? Not sure what you mean here.

If it's about ordering of specifying the local path vs. the download, the local_path string is used in the download script as the local file target

vowelparrot avatar May 02 '23 03:05 vowelparrot

Was this switch complete? Is the following error caused by pyllamacpp?

The following code

```python
from langchain.llms import GPT4All
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

local_path = './models/ggml-gpt4all-j-v1.3-groovy.bin'
callbacks = [StreamingStdOutCallbackHandler()]
llm = GPT4All(model=local_path, callbacks=callbacks, verbose=True)
```

produces this error:

```
llama_model_load: loading model from './models/ggml-gpt4all-j-v1.3-groovy.bin' - please wait ...
llama_model_load: invalid model file './models/ggml-gpt4all-j-v1.3-groovy.bin' (too old, regenerate your model files or convert them with convert-unversioned-ggml-to-ggml.py!)
llama_init_from_file: failed to load model
```

jackxwu avatar May 06 '23 06:05 jackxwu

No worries -- I believe this may be a separate conversation about documentation.


DarpanGanatra avatar May 07 '23 16:05 DarpanGanatra