langchain
langchain copied to clipboard
Add top K as Self Query Retriever Get Documents Input
Add Top K as Optional Parameter for Self Query Retriever
Adding k as a parameter to Self-Retriever's Get Relevant Documents method. This allows users to specify how many documents they want it to retrieve during a search. This is a fixed PR from previous PR: https://github.com/hwchase17/langchain/pull/4297
Who can review?
Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:
@hwchase17 @dev2049
Two questions to figure out here:
-
We don't have benchmarks anywhere. Adding another parameter makes the task more complex which means that performance will likely decrease. It might be by a tiny amount, but we don't know.
-
(Less important) the existing parser can be generalized right now to yield a single "program" rather than be broken into a query,
k, and aprogramfor filtering. It'll clean up some of the code, but same bench marking problem. This is less of an issue since the implementation can be refactored at later time.
@hwchase17 @dev2049
Two questions to figure out here:
- We don't have benchmarks anywhere. Adding another parameter makes the task more complex which means that performance will likely decrease. It might be by a tiny amount, but we don't know.
- (Less important) the existing parser can be generalized right now to yield a single "program" rather than be broken into a query,
k, and aprogramfor filtering. It'll clean up some of the code, but same bench marking problem. This is less of an issue since the implementation can be refactored at later time.
Are there existing benchmarks for other retrievers? Do you want me to look into adding one?
@eyurtsev @hwchase17 any feedback for this? It would be really helpful to add this parameter so I can use Self-Retriever in my application
@hwchase17 @dev2049
Two questions to figure out here:
- We don't have benchmarks anywhere. Adding another parameter makes the task more complex which means that performance will likely decrease. It might be by a tiny amount, but we don't know.
- (Less important) the existing parser can be generalized right now to yield a single "program" rather than be broken into a query,
k, and aprogramfor filtering. It'll clean up some of the code, but same bench marking problem. This is less of an issue since the implementation can be refactored at later time.
i think its actually beneficial to have k separate - otherwise it will get bundled up with other parameters. unless we want to do some complicated decoding....