Bright Data web search component

Open meirk-brd opened this issue 5 months ago • 5 comments

Describe the solution you'd like
Implement Bright Data as a web search component. The implementation can sit under haystack/components/websearch.

Describe alternatives you've considered
People can work with other web search tools, but on complex sites that require JS rendering they might get blocked. Integrating with Bright Data will solve this issue.

Please let me know if you support this contribution, and if you do, I will start implementing it according to your coding conventions.

Thanks!

meirk-brd avatar Jul 08 '25 06:07 meirk-brd

Hi @meirk-brd , the idea of adding a BrightDataWebSearch sounds very good to me! I thought about whether it fits best into haystack/components/websearch or if a new integration in https://github.com/deepset-ai/haystack-core-integrations would be better. Let's go with haystack/components/websearch for now as we also have SerperDevWebSearch and SearchApiWebSearch there and I don't expect the Bright Data integration to require any additional dependencies to be installed.

Contribution guidelines are here and don't hesitate to contact us. Looking forward to your pull request!

julian-risch avatar Jul 11 '25 08:07 julian-risch

Hey @meirk-brd,

I'm happy to collaborate on this issue.

Could you share more about how far along you are with the implementation?

RafaelJohn9 avatar Aug 14 '25 07:08 RafaelJohn9

Hi @RafaelJohn9,

We’re still on hold internally as we focus on releasing a free tier for our MCP, so we haven’t started the integration with Haystack yet.

If you’d like to help us move this forward, it would be greatly appreciated! We can provide you with credits both for testing and as a thank-you for your contribution.

Let me know if you’re interested!

meirk-brd avatar Aug 14 '25 08:08 meirk-brd

Hey @meirk-brd ,

Yes, I'm interested. I will start working on the issue then 🤝

RafaelJohn9 avatar Aug 14 '25 08:08 RafaelJohn9

Sounds great, thank you!

If you have any questions or need anything from me to help with this integration, feel free to reach out at [email protected].

meirk-brd avatar Aug 15 '25 12:08 meirk-brd

Hi all, @meirk-brd @julian-risch

I’d be happy to pick this up and open a PR for a BrightDataWebSearch component under haystack/components/websearch.

I’m planning to implement it using Bright Data’s SERP API via the direct API access endpoint with Authorization: Bearer <API_KEY> and a configurable SERP zone (env var), returning documents + links similar to SearchApiWebSearch and SerperDevWebSearch.

I’ll mirror the existing web search components’ API surface (e.g. top_k, allowed_domains, optional extra search params) and add unit tests with mocked responses. If you have any preferences (e.g. required parameters, naming, or whether to default to Google-only vs multi-engine) please let me know before I open the PR.
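
A minimal sketch of that plan, under several assumptions, might look like the following. It is plain Python rather than a real Haystack component: the class name `BrightDataWebSearchSketch`, the env var `BRIGHTDATA_SERP_ZONE`, the endpoint URL, the `brd_json` parameter, and the `organic` / `title` / `link` / `description` response keys are all assumptions to verify against Bright Data's documentation, and plain dicts stand in for Haystack `Document` objects. The HTTP call is injectable so that unit tests can pass a mocked response.

```python
import json
import os
import urllib.request
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional
from urllib.parse import urlencode

# ASSUMPTION: endpoint for Bright Data's direct API access; check the official docs.
BRIGHTDATA_ENDPOINT = "https://api.brightdata.com/request"

# Transport signature: (url, headers, json_body) -> parsed JSON response.
PostFn = Callable[[str, Dict[str, str], Dict[str, Any]], Dict[str, Any]]


def urllib_post(url: str, headers: Dict[str, str], body: Dict[str, Any]) -> Dict[str, Any]:
    """Default transport: POST a JSON body and parse a JSON response (stdlib only)."""
    request = urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


@dataclass
class BrightDataWebSearchSketch:
    """Hypothetical stand-in for the proposed component; not the Haystack API."""

    api_key: str
    # ASSUMPTION: the env var name BRIGHTDATA_SERP_ZONE is made up for this sketch.
    zone: str = field(default_factory=lambda: os.environ.get("BRIGHTDATA_SERP_ZONE", "serp"))
    top_k: int = 10
    allowed_domains: Optional[List[str]] = None
    search_params: Dict[str, Any] = field(default_factory=dict)
    post: PostFn = urllib_post  # injectable so tests can pass a mock

    def build_request(self, query: str) -> Dict[str, Any]:
        # Restrict results to the allowed domains with site: operators.
        if self.allowed_domains:
            sites = " OR ".join(f"site:{domain}" for domain in self.allowed_domains)
            query = f"{sites} {query}"
        # ASSUMPTION: brd_json=1 requests parsed JSON instead of raw HTML.
        params = {"q": query, "brd_json": 1, **self.search_params}
        return {
            "zone": self.zone,
            "url": "https://www.google.com/search?" + urlencode(params),
            "format": "raw",
        }

    def run(self, query: str) -> Dict[str, List[Any]]:
        headers = {
            "Authorization": f"Bearer {self.api_key}",
            "Content-Type": "application/json",
        }
        payload = self.post(BRIGHTDATA_ENDPOINT, headers, self.build_request(query))
        # ASSUMPTION: results arrive under "organic" with title/link/description keys.
        hits = payload.get("organic", [])[: self.top_k]
        documents = [
            {
                "content": hit.get("description", ""),
                "meta": {"title": hit.get("title"), "link": hit.get("link")},
            }
            for hit in hits
        ]
        links = [hit["link"] for hit in hits if "link" in hit]
        return {"documents": documents, "links": links}
```

In a test, passing `post=` a function that returns a canned `{"organic": [...]}` payload exercises `run` without any network access, which is the mocked-response approach described above.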

If that sounds good, I’ll start working on this and send a PR shortly.

xoaryaa avatar Dec 11 '25 10:12 xoaryaa

Hi @xoaryaa ,

This sounds great! I will contact you via email (found it in npx xoaryaa) to support you through the process, including credits.

I would highly appreciate it if you could follow our coding conventions in the implementation: https://brightdata.com/dna/js_code. I also highly suggest copying the mechanism from our MCP: https://github.com/brightdata/brightdata-mcp. For SERP specifically, you can find optimizations in this PR: https://github.com/brightdata/brightdata-mcp/pull/82/files

Let me know if you have any questions!

meirk-brd avatar Dec 11 '25 10:12 meirk-brd