LLMRouterChain uses deprecated predict_and_parse method
System Info
langchain v0.0.216, Python 3.11.3 on WSL2
Who can help?
@hwchase17
Information
- [X] The official example notebooks/scripts
- [ ] My own modified scripts
Related Components
- [ ] LLMs/Chat Models
- [ ] Embedding Models
- [X] Prompts / Prompt Templates / Prompt Selectors
- [ ] Output Parsers
- [ ] Document Loaders
- [ ] Vector Stores / Retrievers
- [ ] Memory
- [ ] Agents / Agent Executors
- [ ] Tools / Toolkits
- [X] Chains
- [ ] Callbacks/Tracing
- [ ] Async
Reproduction
Follow the first example at https://python.langchain.com/docs/modules/chains/foundational/router
Expected behavior
This warning gets triggered:

> The predict_and_parse method is deprecated, instead pass an output parser directly to LLMChain.
As the warning suggests, we can pass the output parser directly to LLMChain by changing this line to:

```python
llm_chain = LLMChain(llm=llm, prompt=prompt, output_parser=prompt.output_parser)
```
and calling `LLMChain.__call__` instead of `LLMChain.predict_and_parse` by changing these lines to:

```python
cast(
    Dict[str, Any],
    self.llm_chain(inputs, callbacks=callbacks),
)
```
Unfortunately, while this avoids the warning, it creates a new error:

```
ValueError: Missing some output keys: {'destination', 'next_inputs'}
```

because LLMChain currently assumes the existence of a single `self.output_key` and produces this as output:

```
{'text': {'destination': 'physics', 'next_inputs': {'input': 'What is black body radiation?'}}}
```

Even modifying that function to return the parsed keys when the parsed output is a dict triggers the same error, just for the missing key `"text"` instead. `predict_and_parse` avoids this fate by skipping output validation entirely.
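The validation failure described above can be reproduced with a small plain-Python mock. `MiniChain` and `router_parse` below are illustrative stand-ins (not the real LangChain classes), sketching how nesting the parsed dict under a single output key breaks top-level key validation:

```python
from typing import Any, Callable, Dict, Optional, Set

class MiniChain:
    """Toy stand-in for LLMChain's output handling (NOT the real class)."""
    output_key = "text"

    def __init__(self, parser: Optional[Callable[[str], Any]] = None):
        self.parser = parser

    def __call__(self, raw: str, expected_keys: Set[str]) -> Dict[str, Any]:
        # Nest the (possibly parsed) result under the single output_key...
        outputs = {self.output_key: self.parser(raw) if self.parser else raw}
        # ...then validate that all expected keys exist at the top level.
        missing = set(expected_keys) - set(outputs)
        if missing:
            raise ValueError(f"Missing some output keys: {missing}")
        return outputs

def router_parse(raw: str) -> Dict[str, Any]:
    # Pretend router parser: returns a dict of routing fields.
    return {"destination": "physics", "next_inputs": {"input": raw}}

chain = MiniChain(parser=router_parse)
# The parsed dict ends up nested under "text", so validating against the
# router's expected keys fails exactly as reported:
try:
    chain("What is black body radiation?", {"destination", "next_inputs"})
except ValueError as exc:
    print(exc)
```

The parsed routing fields are present, but one level too deep for the validator to see them.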
It appears changes may have to be a bit more involved here if LLMRouterChain is to keep using LLMChain.
I am seeing the same issue on v0.0.221 , Python 3.10.6 Windows 10
I have the same issue on v0.0.229, Python v3.10.12
I have the same issue on v0.0.230, Python v3.10.6 Windows 11
Same issue here, langchain v0.0.232:
```
/opt/homebrew/lib/python3.11/site-packages/langchain/chains/llm.py:275: UserWarning: The predict_and_parse method is deprecated, instead pass an output parser directly to LLMChain.
```
langchain v0.0.235, Python 3.9.17 Windows 10
langchain v0.0.240, Python 3.10.10 macOS Ventura
langchain v0.0.244, Python 3.10.11 Windows 10
I am seeing the same issue on v0.0.257 Python 3.9.12 RedHat
same issue v0.0.270 python 3.11.3 windows
Same warning on v0.0.275 python 3.11.3 WSL on Windows 11
I also get the same error when using this:

```python
print(compressed_docs[0].page_content)
```

If I remove the `page_content` access, I do not get this error. Also, if I use `CharacterTextSplitter` like this, I do not see the error:

```python
text_splitter = CharacterTextSplitter.from_tiktoken_encoder(chunk_size=500)
```
Here is a sample function where this happens:

```python
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import TextLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma


def demo(question):
    '''
    Follow the steps below to fill out this function:
    '''
    # PART ONE: load the source document
    loader = TextLoader('/map/to/document/data.txt', encoding='utf8')
    documents = loader.load()

    # PART TWO: split the document into chunks (you choose how and what size)
    # text_splitter = CharacterTextSplitter.from_tiktoken_encoder(chunk_size=1000)
    text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
        chunk_size=1000, chunk_overlap=100, separators=[" ", ",", "\n"])
    docs = text_splitter.split_documents(documents)

    # PART THREE: embed the document chunks into a persisted ChromaDB
    embedding_function = OpenAIEmbeddings()
    db = Chroma.from_documents(docs, embedding_function, persist_directory='./App')
    db.persist()  # persist() must be called; a bare `db.persist` does nothing

    # PART FOUR: use ChatOpenAI and ContextualCompressionRetriever to return
    # the most relevant part of the documents
    llm = ChatOpenAI(temperature=0)
    compressor = LLMChainExtractor.from_llm(llm)
    compression_retriever = ContextualCompressionRetriever(
        base_compressor=compressor, base_retriever=db.as_retriever())
    compressed_docs = compression_retriever.get_relevant_documents(question)
    print(compressed_docs[0].page_content)
```
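As a side note, the retriever can return an empty list, in which case indexing `[0]` raises an IndexError before any warning matters. A minimal guard might look like the sketch below (`Doc` is an illustrative stand-in, not the real LangChain Document class):

```python
from typing import List, Optional

class Doc:
    """Minimal stand-in for a LangChain Document (illustrative only)."""
    def __init__(self, page_content: str):
        self.page_content = page_content

def first_page_content(docs: List[Doc]) -> Optional[str]:
    """Return the first document's page_content, or None if no docs matched."""
    return docs[0].page_content if docs else None
```

With that guard, an empty retrieval yields `None` instead of an exception.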
Y'all got any fix for this? What are we supposed to do?
Got the same warning while using load_qa_chain with chain_type='map_rerank'.
I posted a very similar issue, #10462, using SelfQueryRetriever. I received some sort of solution from the chatbot, but I don't have a clue how to implement it.
I solved it by extending SelfQueryRetriever and overriding the `_get_relevant_documents` method. Below is an example (here I override `_aget_relevant_documents` because I need async features in my case; you can do the same to `_get_relevant_documents`):
```python
from typing import List, cast

from langchain.callbacks.manager import AsyncCallbackManagerForRetrieverRun
from langchain.chains.query_constructor.ir import StructuredQuery
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain.schema import Document


class AsyncSelfQueryRetriever(SelfQueryRetriever):
    async def _aget_relevant_documents(
        self, query: str, *, run_manager: AsyncCallbackManagerForRetrieverRun
    ) -> List[Document]:
        """Asynchronously get documents relevant to a query.

        Args:
            query: String to find relevant documents for
            run_manager: The callbacks handler to use

        Returns:
            List of relevant documents
        """
        inputs = self.llm_chain.prep_inputs({"query": query})
        structured_query = cast(
            StructuredQuery,
            # Instead of calling 'self.llm_chain.predict_and_parse' here,
            # I changed it to leverage 'self.llm_chain.prompt.output_parser.parse'
            # and 'self.llm_chain.apredict'
            self.llm_chain.prompt.output_parser.parse(
                await self.llm_chain.apredict(
                    callbacks=run_manager.get_child(), **inputs
                )
            ),
        )
        if self.verbose:
            print(structured_query)
        new_query, new_kwargs = self.structured_query_translator.visit_structured_query(
            structured_query
        )
        if structured_query.limit is not None:
            new_kwargs["k"] = structured_query.limit
        if self.use_original_query:
            new_query = query
        search_kwargs = {**self.search_kwargs, **new_kwargs}
        docs = await self.vectorstore.asearch(
            new_query, self.search_type, **search_kwargs
        )
        return docs
```
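The core of this workaround is the predict-then-parse pattern: call `predict()` to get the raw LLM text, then feed it to the prompt's output parser yourself. The shape of that pattern can be sketched with plain-Python stand-ins (`FakeLLMChain` and `FakeOutputParser` are illustrative names, not LangChain classes):

```python
class FakeOutputParser:
    """Stand-in for a prompt's output_parser."""
    def parse(self, text: str) -> dict:
        return {"query": text.strip()}

class FakeLLMChain:
    """Stand-in for an LLMChain; a real one would call the model."""
    def predict(self, **inputs) -> str:
        # Echo the input with padding to mimic raw LLM text.
        return f"  {inputs['query']}  "

chain, parser = FakeLLMChain(), FakeOutputParser()
# Equivalent to the deprecated chain.predict_and_parse(query=...),
# but without touching the deprecated method:
structured = parser.parse(chain.predict(query="black body radiation"))
```

The same two-step call works with the real `llm_chain.prompt.output_parser.parse` and `llm_chain.predict`/`apredict`, as in the override above.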
Hi @ellisxu, much appreciated for sharing! Can you show which packages these are imported from? Thanks a lot!
The package is `from langchain.retrievers.self_query.base import SelfQueryRetriever`, and the langchain version I use is 0.0.279.
I am still running into this with SelfQueryRetriever using langchain==0.0.302 ... is there any resolution here?
https://github.com/langchain-ai/langchain/issues/6819#issuecomment-1720942610 Try this. :)
This is unfortunate, as this part of LangChain is used in the DeepLearningAI course.