
search helper with Searx API

Open blob42 opened this issue 1 year ago • 5 comments

In #854 I am implementing a search helper API for Searx, a popular self-hosted metasearch engine. It makes search available without relying on Google or any paid API.

I started this issue to get some early feedback

blob42 avatar Feb 02 '23 22:02 blob42

I like the idea - I'm not super familiar with how to set up/use searx, so some documentation there may be helpful

hwchase17 avatar Feb 03 '23 05:02 hwchase17

I am almost done with the base features of the module. I made it mirror the usage of the Google Search utility, but since it outputs results in JSON and can provide answers, it is also similar to SerpAPI in some regards.

As time goes on, I have no doubt that more and more people will turn to meta search engines like Searx, as they offer an alternative to paid and restricted options like Google. It's also interesting to note that there is currently a discussion taking place in the SearxNG community regarding the integration of large language models (LLMs).

I have started on the documentation with some usage examples. I managed to reproduce the results of the zero-shot-react example, with the added advantage that you can select which backend engine is used for the search. In my case, the agent did not get the correct age of the boyfriend with the default engines; after I switched the engine to qwant, it returned the correct answers.
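To illustrate what selecting a backend engine means at the HTTP level, here is a minimal sketch that builds a Searx search URL asking for JSON results from specific engines. It is not the wrapper's implementation: the helper name `build_searx_url`, the host, and the query are made up for illustration, and it assumes the instance exposes the usual `/search` endpoint with the `q`, `format`, and `engines` query parameters.

```python
from urllib.parse import urlencode


def build_searx_url(host, query, engines=None):
    """Build a Searx /search URL that requests JSON results.

    Hypothetical helper for illustration only; it is not part of LangChain.
    """
    params = {"q": query, "format": "json"}
    if engines:
        # Assumption: the instance accepts a comma-separated list of engine names.
        params["engines"] = ",".join(engines)
    return f"{host.rstrip('/')}/search?{urlencode(params)}"


# Example: restrict the search to the qwant engine on a local instance.
print(build_searx_url("http://127.0.0.1:8888", "large language models", engines=["qwant"]))
```

Leaving `engines` out should let the instance fall back to whatever engines it enables by default.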

blob42 avatar Feb 06 '23 21:02 blob42

Hi @blob42, thanks for your great work on adding searx. I am new to Searx and I have been encountering 403 errors while using the Searx wrapper. Here's the error message I am receiving:

ValueError: ('Searx API returned an error: ', '<!doctype html>\n\n403 Forbidden\n\nForbidden\n\nYou don't have the permission to access the requested resource. It is either read-protected or not readable by the server.\n')

I have tried using different Searx hosts, both public and local, but the issue persists. My local instance at "127.0.0.1:8888" works fine in my browser, but it does not work with the SearxSearchWrapper.

I would appreciate any help you can provide to resolve this issue. I apologize if this is a basic question, as I am new to Searx.

yaodongC avatar Mar 03 '23 02:03 yaodongC

Hi @yaodongC. There are two likely causes for this problem.

  1. You need to enable the json output format in the searx configuration under the search.formats key, like this:
     search:
       formats:
         - html
         - json
  2. You need to disable the rate limiter plugin (it should be disabled by default) on your self-hosted instance, or use the patch from this PR. This is also why using public instances is not recommended: they all have the rate limiter plugin enabled and the json format disabled.
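A quick way to confirm both points is to request JSON from the instance directly, outside LangChain. The sketch below uses only the Python standard library. The function names `classify_response` and `check_instance` are hypothetical, and it assumes the instance serves the usual `/search?q=...&format=json` endpoint.

```python
import json
import urllib.error
import urllib.request


def classify_response(status, body):
    """Turn an HTTP status code and response body into a short diagnosis."""
    if status == 403:
        return "forbidden: check that json is listed under search.formats and that the rate limiter is off"
    if status != 200:
        return f"unexpected status {status}"
    try:
        json.loads(body)
    except ValueError:
        return "not json: the instance answered, but not with JSON"
    return "ok"


def check_instance(host):
    """Send one test query to a Searx instance and classify the reply."""
    url = f"{host.rstrip('/')}/search?q=test&format=json"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return classify_response(resp.status, resp.read().decode("utf-8", "replace"))
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses such as the 403 above.
        return classify_response(err.code, "")
    except urllib.error.URLError as err:
        return f"unreachable: {err.reason}"
```

Calling `check_instance("http://127.0.0.1:8888")` against a local instance should return "ok" once the json format is enabled. A "forbidden" result points back to the two settings above.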

Let me know if this solves the issue

blob42 avatar Mar 03 '23 03:03 blob42

Thank you so much @blob42! By adding "- json" under formats, the problem is solved!!

yaodongC avatar Mar 03 '23 03:03 yaodongC

I also had this issue, and this solved it for me. I was using the searxng app on my Unraid NAS. As far as I could tell, the setting is not in the preferences GUI, but I was able to open a terminal, edit the settings to match the specs above, and restart the Docker instance, and all was well.

EnviralDesign avatar Apr 07 '23 23:04 EnviralDesign

Hi, @blob42! I'm Dosu, and I'm helping the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, you were implementing a search helper API for the self-hosted meta search engine, Searx. You were seeking early feedback on your implementation. There have been some important developments in the comments. hwchase17 liked the idea and suggested adding documentation. You provided an update on the progress and mentioned a discussion in the SearxNG community about integrating large language models.

Additionally, yaodongC encountered a 403 error while using the Searx wrapper and asked for help. You provided a solution by enabling the json output format and disabling the rate limiter plugin. Both yaodongC and EnviralDesign confirmed that the solution worked for them.

Now, we would like to know if this issue is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you for your contribution!

dosubot[bot] avatar Sep 21 '23 16:09 dosubot[bot]