
When using embedding, the Chinese reply will be incomplete

Open lingfengchencn opened this issue 1 year ago • 2 comments

System Info

macOS, VS Code, Python 3.10.11

Who can help?

No response

Information

  • [ ] The official example notebooks/scripts
  • [X] My own modified scripts

Related Components

  • [ ] LLMs/Chat Models
  • [X] Embedding Models
  • [ ] Prompts / Prompt Templates / Prompt Selectors
  • [X] Output Parsers
  • [ ] Document Loaders
  • [ ] Vector Stores / Retrievers
  • [ ] Memory
  • [ ] Agents / Agent Executors
  • [ ] Tools / Toolkits
  • [ ] Chains
  • [ ] Callbacks/Tracing
  • [ ] Async

Reproduction

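  1. Embed some text into Chroma.
  2. Query the store and run load_qa_chain with OpenAI:

from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.chains.question_answering import load_qa_chain
from langchain.llms import OpenAI

# docsearch is the Chroma vector store built in step 1
query = "some txt"
docs = docsearch.similarity_search(query=query, k=2)
llm = OpenAI(
    streaming=True,
    callbacks=[StreamingStdOutCallbackHandler()],
    temperature=0.1,
)
chain = load_qa_chain(llm=llm, chain_type="stuff", verbose=True)
result = chain.run(input_documents=docs, question=query, return_only_outputs=True)

  3. A Chinese reply is cut off after roughly 127 to 131 characters, whereas an English reply finishes the whole sentence.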

example:

我们***是一家专注于*****机构,近些年来,我们的学员人数突破****,遍布全国***个城市,海外**个国家,这自然是我们家长对于****最好的认可。我们深知宝贝一开始有兴趣,后来因为各种的枯燥变得不愿意学了,因此,我们采用三方合作配合的模式,即家长
我们***是一家专注于*****机构,近些年来,我们的学员人数突破****,遍布全国***个城市,海外**个国家,这自然是我们家长对于***最好的认可。我们深知宝贝一开始有兴趣,后来因为各种的枯燥变得不愿意学了的顾虑,因此我们采用了一种科学的学习模式

Expected behavior

I suspect this is caused by how characters are handled. Looking forward to a fix.

lingfengchencn avatar May 12 '23 11:05 lingfengchencn

Same issue here

qazs avatar May 22 '23 09:05 qazs

Found the solution: you can pass a max_tokens parameter when initializing OpenAI, like this:

llm = OpenAI(temperature=0, max_tokens=2048)
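
A rough back-of-the-envelope sketch of why raising max_tokens may help. It assumes that LangChain's OpenAI wrapper falls back to a completion budget of 256 tokens when max_tokens is not set, and that a Chinese character costs about 2 tokens on average while English text averages about 4 characters per token. The two ratios are illustrative assumptions, not measurements, and max_chars is a hypothetical helper name.

```python
import math

# Assumed default completion budget when max_tokens is not passed.
DEFAULT_MAX_TOKENS = 256

# Assumed averages, not measured values.
ZH_TOKENS_PER_CHAR = 2.0   # about 2 tokens per Chinese character
EN_TOKENS_PER_CHAR = 0.25  # about 4 English characters per token


def max_chars(max_tokens: int, tokens_per_char: float) -> int:
    """Estimate how many characters fit into a completion token budget."""
    return math.floor(max_tokens / tokens_per_char)


# Characters that fit in the assumed default budget.
print("Chinese, default budget:", max_chars(DEFAULT_MAX_TOKENS, ZH_TOKENS_PER_CHAR))
print("English, default budget:", max_chars(DEFAULT_MAX_TOKENS, EN_TOKENS_PER_CHAR))

# Characters that fit once the budget is raised as suggested above.
print("Chinese, max_tokens=2048:", max_chars(2048, ZH_TOKENS_PER_CHAR))
```

Under these assumptions the default budget runs out after about 128 Chinese characters, which is in line with the 127 to 131 characters reported in the issue, while an English answer of the same length stays well inside the limit.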

qazs avatar May 22 '23 14:05 qazs

Thanks. I upgraded langchain and the problem is gone.

lingfengchencn avatar May 23 '23 03:05 lingfengchencn

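You should still print the logs and check whether the 2 retrieved chunks actually contain the answer you are looking for. How the documents are split also matters a lot. Adjust the paragraphs in your documents, re-embed them, and then check again.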
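
One way to inspect the retrieved chunks before they reach the chain might look like the sketch below. FakeDoc and describe_chunks are hypothetical names used only for illustration. The helper relies solely on a page_content attribute, which LangChain documents expose, so the list returned by similarity_search could be passed in place of the stand-in objects.

```python
from dataclasses import dataclass


@dataclass
class FakeDoc:
    """Hypothetical stand-in for a retrieved document; only page_content is used."""
    page_content: str


def describe_chunks(docs, preview_chars: int = 40) -> list[str]:
    """Return one summary line per chunk: index, length, and a short preview."""
    lines = []
    for i, doc in enumerate(docs, start=1):
        text = doc.page_content
        preview = text[:preview_chars].replace("\n", " ")
        lines.append(f"chunk {i}: {len(text)} chars | {preview}")
    return lines


# Stand-in data; in the reproduction this would be the docs list from similarity_search.
docs = [FakeDoc("first retrieved passage"), FakeDoc("second retrieved passage")]
for line in describe_chunks(docs):
    print(line)
```

If the previews show chunks that are unrelated to the question, or that cut a paragraph in half, the splitting step is the place to adjust before re-embedding.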

chujian avatar May 24 '23 07:05 chujian