feat: add integration of ScrapeGraphAI

Open VinciGit00 opened this issue 11 months ago • 20 comments

This PR adds an integration with ScrapeGraphAI (url)

This PR will:

  • Reduce API cost by 30% by using SGAI instead of Firecrawl
  • Increase response speed by 25%
  • Add proxy rotation and better parallel handling of HTTP requests

VinciGit00 avatar May 08 '25 18:05 VinciGit00

@VinciGit00 is attempting to deploy a commit to the projects Team on Vercel.

A member of the Team first needs to authorize it.

vercel[bot] avatar May 08 '25 18:05 vercel[bot]

In my opinion, ScrapeGraph is more suitable than Firecrawl for this purpose

banuakkus8 avatar May 08 '25 18:05 banuakkus8

I think the use of ScrapeGraph could bring advantages for this use case

DPende avatar May 08 '25 18:05 DPende

I really think that this would enhance the experience compared to Firecrawl.

MarchesiGabriele avatar May 08 '25 18:05 MarchesiGabriele

Looking forward to this ScrapeGraphAI integration. I think it is better suited than Firecrawl.

vedovati-matteo avatar May 08 '25 19:05 vedovati-matteo

I was really looking forward to this integration – thank you for making the changes! 🔥 🔥

giorgioberardini1 avatar May 08 '25 20:05 giorgioberardini1

Is ScrapeGraph self-hostable? Ideally, I want to replace it with an in-container scraper.

markokraemer avatar May 09 '25 00:05 markokraemer

Is ScrapeGraph self-hostable? Ideally, I want to replace it with an in-container scraper.

looks like it https://github.com/ScrapeGraphAI/Scrapegraph-ai

oldschoola avatar May 09 '25 01:05 oldschoola

I think this integration will help a lot of people; it is much better than using Firecrawl.

Vikrant-Khedkar avatar May 09 '25 04:05 Vikrant-Khedkar

Is ScrapeGraph self-hostable? Ideally, I want to replace it with an in-container scraper.

No, not for the moment.

VinciGit00 avatar May 09 '25 07:05 VinciGit00

I think the use of ScrapeGraph could bring advantages for this use case

scaliseraoul avatar May 09 '25 07:05 scaliseraoul

Is ScrapeGraph self-hostable? Ideally, I want to replace it with an in-container scraper.

looks like it

https://github.com/ScrapeGraphAI/Scrapegraph-ai

It does not scale for high traffic

VinciGit00 avatar May 09 '25 14:05 VinciGit00

ScrapeGraphAI would be a good choice for this use case, being cheaper and faster than Firecrawl

f-aguzzi avatar May 09 '25 21:05 f-aguzzi

I think the use of ScrapeGraph could bring advantages for this use case

I am thinking of including it within the sandbox, so each agent would spin up its own instance to use as its own scraper.

markokraemer avatar May 10 '25 02:05 markokraemer

@markokraemer let me know if you need more info

VinciGit00 avatar May 10 '25 07:05 VinciGit00

Can you PR it as a separate tool, scrape_graph_search_tool.py or similar? We will introduce modular tools that can be enabled or disabled, and then we will have it as an option to choose from.

markokraemer avatar May 13 '25 23:05 markokraemer

Hi @markokraemer,

I updated it

VinciGit00 avatar May 14 '25 08:05 VinciGit00

I meant just adding it as a separate tool and keeping the current web-search tool as is.

markokraemer avatar May 16 '25 15:05 markokraemer

We are adding a Tool Marketplace, so we will have it in there as well, and you can choose and configure what you want to use.

markokraemer avatar May 16 '25 15:05 markokraemer

OK, where is the Tool Marketplace?

VinciGit00 avatar May 16 '25 15:05 VinciGit00

:/

VinciGit00 avatar Sep 24 '25 13:09 VinciGit00