firecrawl
[Bug] timeout parameter not passed to playwright service
Describe the Bug
The timeout parameter is not passed to playwright-service.
To Reproduce
Steps to reproduce the issue:
- Configure the docker-compose setup with the Python-based microservice:
services:
  playwright-service:
    build: apps/playwright-service
.env:
PLAYWRIGHT_MICROSERVICE_URL: http://playwright-service:3000/html
- Send an API request:
{
  "url": "https://[removed]",
  "timeout": 60000,
  "waitFor": 30000,
  "formats": [
    "markdown"
  ]
}
- Observe that the request sent to the microservice omits the timeout:
POST /html HTTP/1.1
Host: playwright-service:3000
[..]
{"url":"[removed]","wait_after_load":30000}
The log displays an error with the default timeout of 15000ms:
playwright-service-1 | playwright._impl._errors.TimeoutError: Page.goto: Timeout 15000ms exceeded.
playwright-service-1 | Call log:
playwright-service-1 | navigating to "https://[removed]", waiting until "load"
Expected Behavior
The timeout should be passed to playwright-service and used for Page.goto.
Additional Context
The service expects a timeout parameter in the body:
https://github.com/mendableai/firecrawl/blob/a40fb3b062dfee4d1dd79c4c4946f2f418da32c7/apps/playwright-service/main.py#L91-L95
The playwright integration is not passing the parameter: https://github.com/mendableai/firecrawl/blob/a40fb3b062dfee4d1dd79c4c4946f2f418da32c7/apps/api/src/scraper/WebScraper/scrapers/playwright.ts#L38-L44
The suggested fix is to pass that parameter in the integration.
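A rough sketch of what that could look like on the integration side, not the actual firecrawl code. The field names url and wait_after_load are taken from the request body logged above, and timeout is the field the service is described as expecting. The function name buildPlaywrightPayload and the PlaywrightPayload interface are hypothetical, made up for illustration.

```typescript
// Hypothetical shape of the JSON body sent to playwright-service.
// "url" and "wait_after_load" match the logged request above;
// "timeout" is the field the issue says the service expects.
interface PlaywrightPayload {
  url: string;
  wait_after_load: number;
  timeout?: number;
}

// Hypothetical helper: builds the request body and forwards the
// caller's timeout instead of silently dropping it.
function buildPlaywrightPayload(
  url: string,
  waitFor: number = 0,
  timeout?: number
): PlaywrightPayload {
  const payload: PlaywrightPayload = { url, wait_after_load: waitFor };
  // Only include the field when the caller set it, so the service
  // can still fall back to its own default otherwise.
  if (timeout !== undefined) {
    payload.timeout = timeout;
  }
  return payload;
}

// Values from the API request in the reproduction steps.
const body = buildPlaywrightPayload("https://example.com", 30000, 60000);
console.log(JSON.stringify(body));
```

With the values from the reproduction steps, the body would then carry timeout: 60000 alongside wait_after_load: 30000, rather than leaving the service on its 15000ms default.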