Alternatives to You.com search API

Open songkq opened this issue 10 months ago • 14 comments

@shaoyijia @Yucheng-Jiang Hi, I'm wondering whether other search APIs can be used with storm, since the You.com API requires a credit card. https://github.com/stanford-oval/storm/blob/42f4d5bbbaca67bc2e4e8ea5814e0975fef971fc/src/modules/topic_expert.py#L79

For example, can these search APIs provided by langchain be used as alternatives to You.com?

  • https://github.com/langchain-ai/langchain/blob/6dc4f592ba62fef08ba6bb832b7b6a4ef578f327/libs/community/langchain_community/utilities/duckduckgo_search.py#L11

  • https://github.com/langchain-ai/langchain/blob/6dc4f592ba62fef08ba6bb832b7b6a4ef578f327/libs/community/langchain_community/utilities/bing_search.py#L13

  • https://github.com/langchain-ai/langchain/blob/6dc4f592ba62fef08ba6bb832b7b6a4ef578f327/libs/community/langchain_community/utilities/brave_search.py#L9

  • https://github.com/langchain-ai/langchain/blob/6dc4f592ba62fef08ba6bb832b7b6a4ef578f327/libs/community/langchain_community/utilities/tavily_search.py#L16

  • https://github.com/langchain-ai/langchain/blob/6dc4f592ba62fef08ba6bb832b7b6a4ef578f327/libs/community/langchain_community/utilities/searx_search.py

  • https://github.com/langchain-ai/langchain/blob/6dc4f592ba62fef08ba6bb832b7b6a4ef578f327/libs/community/langchain_community/utilities/searchapi.py#L9

songkq avatar Apr 13 '24 10:04 songkq

Hi, thanks for your interest!

We view this project as an example of a knowledge curation engine that serves as the intermediate layer between vast unstructured information and humans. So, supporting different information sources is in our plan.

For the pointers you provided, are you willing to open a PR for the integration? Happy to help merge it.

shaoyijia avatar Apr 13 '24 18:04 shaoyijia

Yeah, I'll try to integrate more search APIs into storm.

songkq avatar Apr 14 '24 13:04 songkq

Great, thank you!

shaoyijia avatar Apr 14 '24 16:04 shaoyijia

If we have plans and a to-do list, I'd like to claim some tasks to help.

LronDC avatar Apr 15 '24 08:04 LronDC

@LronDC Thank you for your interest in our project! We're currently working on an upcoming code release that will enhance the scalability of the project. We will keep you updated and soon share some potential tasks where the community can contribute. Stay tuned!

Yucheng-Jiang avatar Apr 15 '24 20:04 Yucheng-Jiang

@shaoyijia Please review this pull request https://github.com/stanford-oval/storm/pull/20.

  1. Support DuckDuckGoSearchAPI and TavilySearchAPI as alternatives to You.com.
  2. When TopicExpert is set to use DuckDuckGoSearchAPI or TavilySearchAPI, these APIs return complete page contents instead of snippets by default.
  3. The search API is configured through these settings in secrets.toml:
     • Set WEB_SEARCH_API to one of ['DuckDuckGoSearchAPI', 'TavilySearchAPI', 'YouSearchAPI']; YouSearchAPI is the default.
     • Set the You.com search API key with YDC_API_KEY=<your_youcom_api_key>.
     • Set the api.tavily.com search API key with TAVILY_API_KEY=<your_api_tavily_com_key>.
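
To make the configuration in item 3 concrete, here is a minimal sketch of how the selection could be resolved at startup. It is not the code from the PR: the function name `resolve_search_api` is hypothetical, the settings are assumed to be exposed as environment-style key/value pairs, and DuckDuckGo is assumed to need no key because none is listed for it above.

```python
import os

# Backend names and key variables as listed in this comment.
SUPPORTED_SEARCH_APIS = ("DuckDuckGoSearchAPI", "TavilySearchAPI", "YouSearchAPI")

# Assumption: DuckDuckGoSearchAPI needs no key, since none is listed for it.
REQUIRED_KEY = {
    "YouSearchAPI": "YDC_API_KEY",
    "TavilySearchAPI": "TAVILY_API_KEY",
}


def resolve_search_api(env=None):
    """Return (backend_name, api_key_or_None) from a mapping of settings.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    name = env.get("WEB_SEARCH_API", "YouSearchAPI")  # YouSearchAPI is the default
    if name not in SUPPORTED_SEARCH_APIS:
        raise ValueError(
            f"WEB_SEARCH_API must be one of {SUPPORTED_SEARCH_APIS}, got {name!r}"
        )
    key_var = REQUIRED_KEY.get(name)
    if key_var is None:
        return name, None
    api_key = env.get(key_var)
    if not api_key:
        raise KeyError(f"{name} requires {key_var} to be set")
    return name, api_key
```

Passing a plain dict instead of relying on `os.environ` keeps the helper easy to test, for example `resolve_search_api({"WEB_SEARCH_API": "DuckDuckGoSearchAPI"})`.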

songkq avatar Apr 16 '24 13:04 songkq

@shaoyijia Since you plan to support different information sources, I recommend our open-source project, QAnything. QAnything is a local knowledge base question-answering system designed to support a wide range of file formats and databases.

songkq avatar Apr 17 '24 01:04 songkq

Sadly, we cannot access You.com.

dl942702882 avatar Apr 17 '24 06:04 dl942702882

@dl942702882 You.com offers a free tier of API quota. It's sufficient to write more than 25 articles locally.

Yucheng-Jiang avatar Apr 17 '24 06:04 Yucheng-Jiang

@dl942702882, for switching to customized sources (before we support this officially), maybe you can check out what this PR (#20) tries to do?

shaoyijia avatar Apr 17 '24 06:04 shaoyijia

An update in this thread:

We just released the refactored code to make it easier to run, customize, and develop the STORM engine. Search API and retrieval model integration now lives in src/rm.py. The knowledge curation engine directly consumes the Information output by the Retriever.
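
As a rough illustration of that split (not the actual contents of src/rm.py), the sketch below wraps an arbitrary search function in a retriever that returns de-duplicated records for the engine to consume. The field names on `Information`, the `search_fn` signature, and the `top_k` parameter are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Information:
    """Hypothetical record shape; the real class may differ."""
    url: str
    title: str
    snippets: List[str] = field(default_factory=list)


class Retriever:
    """Wraps any search backend behind one retrieve() call.

    `search_fn` is assumed to take a query string and return a list of
    dicts with "url", "title", and "snippet" keys.
    """

    def __init__(self, search_fn: Callable[[str], List[Dict[str, str]]], top_k: int = 3):
        self.search_fn = search_fn
        self.top_k = top_k

    def retrieve(self, queries: List[str]) -> List[Information]:
        by_url: Dict[str, Information] = {}
        for query in queries:
            for hit in self.search_fn(query)[: self.top_k]:
                info = by_url.get(hit["url"])
                if info is None:
                    # First time this page is seen: create its record.
                    by_url[hit["url"]] = Information(
                        url=hit["url"], title=hit["title"], snippets=[hit["snippet"]]
                    )
                elif hit["snippet"] not in info.snippets:
                    # Same page from another query: merge the new snippet.
                    info.snippets.append(hit["snippet"])
        return list(by_url.values())
```

With this shape, swapping You.com for another backend only means passing a different `search_fn`; the consumer of the `Information` list does not change.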

shaoyijia avatar Apr 23 '24 06:04 shaoyijia

@LronDC @songkq, we are now specifically interested in supporting:

  1. Retrieval models that can retrieve information from a customized source.
  2. Search APIs that return academic sources, e.g., the Semantic Scholar API.

Contributions are highly appreciated if you are interested!
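
For anyone picking up the Semantic Scholar item, a possible starting point is sketched below. It only builds a request for the public Graph API paper search endpoint and flattens a response; nothing here comes from the STORM codebase, the helper names are hypothetical, and the chosen `fields` are just one reasonable selection.

```python
from typing import Any, Dict, List, Optional
from urllib.parse import urlencode
from urllib.request import Request

# Semantic Scholar Graph API paper search endpoint.
S2_SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"


def build_s2_request(query: str, limit: int = 5, api_key: Optional[str] = None) -> Request:
    """Build (but do not send) a paper search request. Hypothetical helper."""
    params = urlencode({"query": query, "limit": limit, "fields": "title,abstract,url"})
    headers = {"x-api-key": api_key} if api_key else {}
    return Request(f"{S2_SEARCH_URL}?{params}", headers=headers)


def parse_s2_response(payload: Dict[str, Any]) -> List[Dict[str, str]]:
    """Flatten the "data" list of a search response into url/title/snippet dicts."""
    results = []
    for paper in payload.get("data", []):
        results.append(
            {
                "url": paper.get("url") or "",
                "title": paper.get("title") or "",
                # Abstracts can be missing, so fall back to an empty snippet.
                "snippet": paper.get("abstract") or "",
            }
        )
    return results
```

Sending the request, for example with `urllib.request.urlopen(build_s2_request("knowledge curation"))` followed by `json.load`, is left out so the sketch runs without network access or a key.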

shaoyijia avatar Apr 23 '24 06:04 shaoyijia

@shaoyijia Hi, I'll add Semantic Scholar API support as soon as I obtain an API key.

songkq avatar Apr 23 '24 07:04 songkq

Hi @songkq, thank you so much! I have a Semantic Scholar API key, so I can also test it.

shaoyijia avatar Apr 23 '24 15:04 shaoyijia

@songkq We now support more retrieval methods; see the documentation here: https://github.com/stanford-oval/storm?tab=readme-ov-file#api.

In addition to You.com, we also support Bing search and customized corpus retrieval with a vector database.

Yucheng-Jiang avatar Jul 18 '24 04:07 Yucheng-Jiang