[Self-Host] fetch engine does not support proxy settings

Open mschfh opened this issue 1 month ago • 2 comments

Describe the Issue

The fetch scraper does not support proxy settings.

Expected Behavior

The fetch scraper should use the same proxy settings as the playwright-service:

PROXY_SERVER=
PROXY_USERNAME=
PROXY_PASSWORD=

Environment (please complete the following information):

  • N/A

Logs

worker-1              | 2025-01-03 16:30:37 info [ScrapeURL:]: Scraping via playwright...
worker-1              | 2025-01-03 16:30:43 info [ScrapeURL:]: An unexpected error happened while scraping with playwright.
worker-1              | 2025-01-03 16:30:43 info [ScrapeURL:]: Scraping via fetch...
worker-1              | 2025-01-03 16:30:43 info [ScrapeURL:]: Scrape via fetch deemed successful.

Configuration

N/A

Additional Context

This should be fixable by adding a ProxyAgent and passing it via the dispatcher parameter here: https://github.com/mendableai/firecrawl/blob/87757d9b8e6bacc658b48832deb47c51eaf7412a/apps/api/src/scraper/scrapeURL/engines/fetch/index.ts#L17C7-L20
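
A rough sketch of what that could look like, not a tested patch. It assumes the fetch engine calls undici's `fetch` and that `PROXY_SERVER` holds either a full URL or a bare `host:port`. `buildProxyOptions` is a hypothetical helper name, not something in the repository. It maps the three variables above onto the `uri` and `token` options of undici's `ProxyAgent`:

```typescript
// Hypothetical helper: turns the playwright-service proxy variables into
// the options object accepted by undici's ProxyAgent constructor.
interface ProxyOptions {
  uri: string;
  token?: string; // sent as the Proxy-Authorization header value
}

function buildProxyOptions(
  env: Record<string, string | undefined>,
): ProxyOptions | undefined {
  const server = env.PROXY_SERVER;
  if (!server) return undefined; // no proxy configured: keep the default dispatcher

  // Assumption: a value without a scheme (host:port) means a plain HTTP proxy.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(server);
  const uri = hasScheme ? server : `http://${server}`;

  const username = env.PROXY_USERNAME;
  if (!username) return { uri }; // unauthenticated proxy

  // Basic auth credentials are base64("username:password").
  const password = env.PROXY_PASSWORD ?? "";
  const credentials = Buffer.from(`${username}:${password}`).toString("base64");
  return { uri, token: `Basic ${credentials}` };
}

// Usage inside the fetch engine (not executed here; needs the undici package):
//   import { ProxyAgent, fetch } from "undici";
//   const opts = buildProxyOptions(process.env);
//   const dispatcher = opts ? new ProxyAgent(opts) : undefined;
//   const response = await fetch(url, { dispatcher });
```

Returning `undefined` when `PROXY_SERVER` is unset keeps the current behavior for setups without a proxy, since an undefined `dispatcher` falls back to the default one.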

mschfh avatar Jan 03 '25 16:01 mschfh