Langchain-Chatchat
Error at step 3 (asking) when running the Web UI on CPU
The Web UI runs and file loading works fine, but an error is raised when asking:
README.txt 已成功加载 [loaded successfully]
Traceback (most recent call last):
File "/home/chwang/.local/lib/python3.8/site-packages/gradio/routes.py", line 394, in run_predict
output = await app.get_blocks().process_api(
File "/home/chwang/.local/lib/python3.8/site-packages/gradio/blocks.py", line 1075, in process_api
result = await self.call_function(
File "/home/chwang/.local/lib/python3.8/site-packages/gradio/blocks.py", line 884, in call_function
prediction = await anyio.to_thread.run_sync(
File "/home/chwang/.local/lib/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/home/chwang/.local/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/home/chwang/.local/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "webui.py", line 31, in get_answer
resp, history = kb.get_knowledge_based_answer(
File "/repo/chaowang/AI/langchain-ChatGLM/knowledge_based_chatglm.py", line 98, in get_knowledge_based_answer
retriever=vector_store.as_retriever(search_kwargs={"k": VECTOR_SEARCH_TOP_K}),
AttributeError: 'NoneType' object has no attribute 'as_retriever'
Judging from the error, the document has finished loading but the vector_store has not been generated yet; once vector_store generation completes, the frontend will return a reply.
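If it helps, a guard along these lines would turn the opaque AttributeError into an explicit "not ready" message. This is a minimal sketch using the names that appear in the traceback; `build_retriever` and the constant's value are illustrative, not the project's actual code:

```python
# Minimal sketch, not the repo's actual code: fail loudly when the vector
# store has not been generated yet, instead of crashing on NoneType.
VECTOR_SEARCH_TOP_K = 6  # illustrative value; the real constant lives in the project config

def build_retriever(vector_store):
    if vector_store is None:
        # The document loaded fine, but embedding / vector-index generation
        # has not finished (or failed), so there is nothing to retrieve from.
        raise RuntimeError("Vector store not ready yet; wait for generation to finish.")
    return vector_store.as_retriever(search_kwargs={"k": VECTOR_SEARCH_TOP_K})
```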
Tried again and waited a while longer; the frontend now shows that the document has loaded, but asking just hangs in the "..." waiting state.

The backend process is still running, but it reports the following error: The dtype of attention mask (torch.int64) is not bool
A single reply on CPU can take quite a long time, so we suggest testing with the command-line demo first; the Web UI is still being improved.
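As an aside, the "The dtype of attention mask (torch.int64) is not bool" line is a PyTorch warning rather than a fatal error: a 0/1 integer mask was passed where a boolean mask is expected, and with ChatGLM-6B it is commonly reported as harmless. A standalone illustration (not the project's code):

```python
import torch

# Casting a 0/1 integer attention mask to bool silences the warning
# without changing which positions the mask marks.
mask_int = torch.tensor([[1, 1, 1, 0]])  # Python ints give int64 by default
mask_bool = mask_int.bool()              # same positions, dtype torch.bool
print(mask_int.dtype, mask_bool.dtype)   # torch.int64 torch.bool
```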
v0.1.0 has just been released; please update to the latest code and test again.
That was a quick update, nice. I ran the latest version on a 24-core CPU; inference took quite a while and then errored:
python3 ./webui.py
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Downloading (…)/modeling_chatglm.py: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 57.6k/57.6k [00:00<00:00, 158kB/s]
2023-04-19 08:56:51.779202: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Downloading shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [00:07<00:00, 1.12it/s]
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 8/8 [01:07<00:00, 8.41s/it]
No sentence-transformers model found with name /home/chwang/.cache/torch/sentence_transformers/GanymedeNil_text2vec-large-chinese. Creating a new one with MEAN pooling.
No sentence-transformers model found with name /home/chwang/.cache/torch/sentence_transformers/GanymedeNil_text2vec-large-chinese. Creating a new one with MEAN pooling.
Running on local URL: http://0.0.0.0:7860
To create a public link, set share=True in launch().
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
No compiled kernel found.
Compiling kernels : /home/chwang/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4-qe/977d9df4cfae6b7a756e07698483872c5c070eee/quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 /home/chwang/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4-qe/977d9df4cfae6b7a756e07698483872c5c070eee/quantization_kernels_parallel.c -shared -o /home/chwang/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4-qe/977d9df4cfae6b7a756e07698483872c5c070eee/quantization_kernels_parallel.so
Kernels compiled : /home/chwang/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4-qe/977d9df4cfae6b7a756e07698483872c5c070eee/quantization_kernels_parallel.so
Load kernel : /home/chwang/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4-qe/977d9df4cfae6b7a756e07698483872c5c070eee/quantization_kernels_parallel.so
Setting CPU quantization kernel threads to 12
Using quantization cache
Applying quantization to glm layers
No sentence-transformers model found with name /home/chwang/.cache/torch/sentence_transformers/GanymedeNil_text2vec-large-chinese. Creating a new one with MEAN pooling.
No sentence-transformers model found with name /home/chwang/.cache/torch/sentence_transformers/GanymedeNil_text2vec-large-chinese. Creating a new one with MEAN pooling.
content/README.txt 已成功加载 [loaded successfully]
The dtype of attention mask (torch.int64) is not bool
Traceback (most recent call last):
File "/home/chwang/.local/lib/python3.8/site-packages/gradio/routes.py", line 394, in run_predict
output = await app.get_blocks().process_api(
File "/home/chwang/.local/lib/python3.8/site-packages/gradio/blocks.py", line 1075, in process_api
result = await self.call_function(
File "/home/chwang/.local/lib/python3.8/site-packages/gradio/blocks.py", line 884, in call_function
prediction = await anyio.to_thread.run_sync(
File "/home/chwang/.local/lib/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/home/chwang/.local/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/home/chwang/.local/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "./webui.py", line 44, in get_answer
resp, history = local_doc_qa.get_knowledge_based_answer(
File "/repo/chaowang/AI/langchain-ChatGLM/chains/local_doc_qa.py", line 114, in get_knowledge_based_answer
result = knowledge_chain({"query": query})
File "/home/chwang/.local/lib/python3.8/site-packages/langchain/chains/base.py", line 116, in call
raise e
File "/home/chwang/.local/lib/python3.8/site-packages/langchain/chains/base.py", line 113, in call
outputs = self._call(inputs)
File "/home/chwang/.local/lib/python3.8/site-packages/langchain/chains/retrieval_qa/base.py", line 110, in _call
answer, _ = self.combine_documents_chain.combine_docs(docs, question=question)
File "/home/chwang/.local/lib/python3.8/site-packages/langchain/chains/combine_documents/stuff.py", line 89, in combine_docs
return self.llm_chain.predict(**inputs), {}
File "/home/chwang/.local/lib/python3.8/site-packages/langchain/chains/llm.py", line 151, in predict
return self(kwargs)[self.output_key]
File "/home/chwang/.local/lib/python3.8/site-packages/langchain/chains/base.py", line 116, in call
raise e
File "/home/chwang/.local/lib/python3.8/site-packages/langchain/chains/base.py", line 113, in call
outputs = self._call(inputs)
File "/home/chwang/.local/lib/python3.8/site-packages/langchain/chains/llm.py", line 57, in _call
return self.apply([inputs])[0]
File "/home/chwang/.local/lib/python3.8/site-packages/langchain/chains/llm.py", line 118, in apply
response = self.generate(input_list)
File "/home/chwang/.local/lib/python3.8/site-packages/langchain/chains/llm.py", line 62, in generate
return self.llm.generate_prompt(prompts, stop)
File "/home/chwang/.local/lib/python3.8/site-packages/langchain/llms/base.py", line 107, in generate_prompt
return self.generate(prompt_strings, stop=stop)
File "/home/chwang/.local/lib/python3.8/site-packages/langchain/llms/base.py", line 140, in generate
raise e
File "/home/chwang/.local/lib/python3.8/site-packages/langchain/llms/base.py", line 137, in generate
output = self._generate(prompts, stop=stop)
File "/home/chwang/.local/lib/python3.8/site-packages/langchain/llms/base.py", line 324, in _generate
text = self._call(prompt, stop=stop)
File "/repo/chaowang/AI/langchain-ChatGLM/models/chatglm_llm.py", line 71, in _call
response, _ = self.model.chat(
File "/usr/local/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/chwang/.cache/huggingface/modules/transformers_modules/THUDM/chatglm-6b-int4-qe/977d9df4cfae6b7a756e07698483872c5c070eee/modeling_chatglm.py", line 1255, in chat
response = tokenizer.decode(outputs)
File "/home/chwang/.local/lib/python3.8/site-packages/transformers/tokenization_utils_base.py", line 3474, in decode
token_ids = to_py_obj(token_ids)
File "/home/chwang/.local/lib/python3.8/site-packages/transformers/utils/generic.py", line 174, in to_py_obj
return [to_py_obj(o) for o in obj]
File "/home/chwang/.local/lib/python3.8/site-packages/transformers/utils/generic.py", line 174, in
Please check your protobuf version, or re-run pip install -r requirements.txt.
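For context on why decoding touches TensorFlow at all: transformers' to_py_obj probes every installed framework (is_tf_tensor), and importing TensorFlow fails because google.protobuf.internal.builder only exists in newer protobuf releases. A quick standalone check (a sketch, not project code):

```python
# Standalone diagnostic: the 'builder' module that TensorFlow's generated
# *_pb2 files import was added in protobuf 3.20+; anything older raises
# exactly the ImportError seen at the bottom of the traceback.
import google.protobuf

print("protobuf version:", google.protobuf.__version__)
try:
    from google.protobuf.internal import builder  # noqa: F401
    print("protobuf is new enough")
except ImportError:
    print("protobuf too old -- upgrade it or re-run: pip install -r requirements.txt")
```

If TensorFlow is not actually needed in this environment, uninstalling it also sidesteps the probe entirely, since is_tf_available() then returns False.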