[Self-Host] Client JS SDK and Python SDK do not connect to the http://localhost:3002 build URL on the same system.

Open samyogdhital opened this issue 1 week ago • 2 comments

  • I carefully followed the self-hosting instructions.
  • I also copied .env.example from /api and made one minimal change: setting PLAYWRIGHT_MICROSERVICE_URL to http://localhost:3000/scrape
  • I have shared my full .env file below.

Run docker compose

I then ran docker compose build and docker compose up.

Check the api

I then sent the curl request below:

curl -X POST http://localhost:3002/v1/crawl \
    -H 'Content-Type: application/json' \
    -H 'Authorization: Bearer YOUR_API_KEY' \
    -d '{
      "url": "https://docs.firecrawl.dev",
      "limit": 100,
      "scrapeOptions": {
        "formats": ["markdown", "html"]
      }
    }'

It ran successfully.

Connect it with the Python SDK and Node.js SDK

In Node.js, I ran the code below:

const firecrawl = new FirecrawlApp({
  apiKey: '',
  apiUrl: "http://localhost:3002",
});

but I always get an error like the one below.

Image
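One way to narrow down whether the failure comes from the SDK or from plain connectivity is to send the same request as the curl example directly from Node.js, without the SDK. The sketch below is only an illustration under a few assumptions: Node 18+ (for the global fetch), the same base URL and payload as the curl request above, and helper names (buildCrawlRequest, checkCrawlEndpoint) that are made up here and are not part of any Firecrawl SDK.

```javascript
// Sketch: reproduce the curl request from Node.js without the SDK.
// buildCrawlRequest and checkCrawlEndpoint are hypothetical helper names.

function buildCrawlRequest(baseUrl, apiKey) {
  // Strip trailing slashes so "http://localhost:3002/" and
  // "http://localhost:3002" produce the same endpoint URL.
  const url = baseUrl.replace(/\/+$/, "") + "/v1/crawl";

  const headers = { "Content-Type": "application/json" };
  // Only send an Authorization header when a key is actually provided.
  if (apiKey) {
    headers["Authorization"] = `Bearer ${apiKey}`;
  }

  // Same payload as the curl request in this report.
  const body = JSON.stringify({
    url: "https://docs.firecrawl.dev",
    limit: 100,
    scrapeOptions: { formats: ["markdown", "html"] },
  });

  return { url, init: { method: "POST", headers, body } };
}

async function checkCrawlEndpoint(baseUrl, apiKey) {
  const { url, init } = buildCrawlRequest(baseUrl, apiKey);
  try {
    const res = await fetch(url, init); // global fetch, assumes Node 18+
    return { ok: res.ok, status: res.status, text: await res.text() };
  } catch (err) {
    // Network-level failure (connection refused, DNS error, and so on).
    return { ok: false, status: null, text: String(err) };
  }
}
```

Calling checkCrawlEndpoint("http://localhost:3002", "").then(console.log) from the same machine would show either an HTTP status from the API or a network-level error. If this succeeds while the SDK call fails, the problem is more likely in how the SDK is configured than in the Docker setup.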

Environment (please complete the following information):

  • OS: [Windows]

Logs

PS C:\Users\user\Desktop\user\firecrawl_original> docker compose build; docker compose up
time="2025-02-13T02:30:43+05:45" level=warning msg="The \"OPENAI_BASE_URL\" variable is not set. Defaulting to a blank string."
time="2025-02-13T02:30:43+05:45" level=warning msg="The \"LOGTAIL_KEY\" variable is not set. Defaulting to a blank string."
time="2025-02-13T02:30:43+05:45" level=warning msg="The \"SERPER_API_KEY\" variable is not set. Defaulting to a blank string."
time="2025-02-13T02:30:43+05:45" level=warning msg="The \"LOGTAIL_KEY\" variable is not set. Defaulting to a blank string."
time="2025-02-13T02:30:43+05:45" level=warning msg="The \"OPENAI_BASE_URL\" variable is not set. Defaulting to a blank string."
[+] Building 4.7s (62/62) FINISHED                                                                                                                                     docker:desktop-linux
 => [playwright-service internal] load build definition from Dockerfile                                                                                                                0.0s
 => => transferring dockerfile: 1.20kB                                                                                                                                                 0.0s
 => [playwright-service internal] load metadata for docker.io/library/python:3.11-slim                                                                                                 1.9s
 => [playwright-service internal] load .dockerignore                                                                                                                                   0.0s
 => => transferring context: 2B                                                                                                                                                        0.0s
 => [playwright-service 1/6] FROM docker.io/library/python:3.11-slim@sha256:42420f737ba91d509fc60d5ed65ed0492678a90c561e1fa08786ae8ba8b52eda                                           0.0s
 => [playwright-service internal] load build context                                                                                                                                   0.0s
 => => transferring context: 242B                                                                                                                                                      0.0s
 => CACHED [playwright-service 2/6] RUN apt-get update && apt-get install -y --no-install-recommends     gcc     libstdc++6                                                            0.0s
 => CACHED [playwright-service 3/6] WORKDIR /app                                                                                                                                       0.0s
 => CACHED [playwright-service 4/6] COPY requirements.txt ./                                                                                                                           0.0s
 => CACHED [playwright-service 5/6] RUN pip install --no-cache-dir --upgrade -r requirements.txt &&     pip uninstall -y py &&     playwright install chromium && playwright install-  0.0s
 => CACHED [playwright-service 6/6] COPY . ./                                                                                                                                          0.0s
 => [playwright-service] exporting to image                                                                                                                                            0.0s
 => => exporting layers                                                                                                                                                                0.0s
 => => writing image sha256:c21039976bbd3669abe47e8dc9d5b55c7517c64753aa954555e44ac4a481a382                                                                                           0.0s
 => => naming to docker.io/library/firecrawl-playwright-service                                                                                                                        0.0s
 => [playwright-service] resolving provenance for metadata file                                                                                                                        0.0s
 => [api internal] load build definition from Dockerfile                                                                                                                               0.0s
 => => transferring dockerfile: 2.17kB                                                                                                                                                 0.0s
 => [worker internal] load metadata for docker.io/library/rust:1-bullseye                                                                                                              2.2s
 => [worker internal] load metadata for docker.io/library/golang:1.19                                                                                                                  2.2s
 => [worker internal] load metadata for docker.io/library/node:20-slim                                                                                                                 2.1s
 => [api internal] load .dockerignore                                                                                                                                                  0.0s
 => => transferring context: 77B                                                                                                                                                       0.0s
 => [api internal] load build context                                                                                                                                                  0.0s
 => => transferring context: 14.74kB                                                                                                                                                   0.0s
 => [worker go-base 1/3] FROM docker.io/library/golang:1.19@sha256:3025bf670b8363ec9f1b4c4f27348e6d9b7fec607c47e401e40df816853e743a                                                    0.0s
 => [worker rust-base 1/3] FROM docker.io/library/rust:1-bullseye@sha256:1e3f7a9fd1f278cc4be02a830745f40fe4b22f0114b2464a452c50273cc1020d                                              0.0s
 => [worker base 1/5] FROM docker.io/library/node:20-slim@sha256:5da391c4b0398f37074b6370dd9e32cebd322c2a227053ca4ae2e7a257f0be21                                                      0.0s
 => CACHED [worker base 2/5] RUN npm i -g corepack@latest                                                                                                                              0.0s
 => CACHED [worker base 3/5] RUN corepack enable                                                                                                                                       0.0s
 => CACHED [api base 4/5] COPY . /app                                                                                                                                                  0.0s
 => CACHED [api base 5/5] WORKDIR /app                                                                                                                                                 0.0s
 => CACHED [api prod-deps 1/1] RUN --mount=type=cache,id=pnpm,target=/pnpm/store pnpm install --prod --frozen-lockfile                                                                 0.0s
 => CACHED [api stage-5 1/5] COPY --from=prod-deps /app/node_modules /app/node_modules                                                                                                 0.0s
 => CACHED [api build 1/4] RUN --mount=type=cache,id=pnpm,target=/pnpm/store pnpm install --frozen-lockfile                                                                            0.0s
 => CACHED [api build 2/4] RUN apt-get clean && apt-get update -qq && apt-get install -y ca-certificates && update-ca-certificates                                                     0.0s
 => CACHED [api build 3/4] RUN pnpm install                                                                                                                                            0.0s
 => CACHED [api build 4/4] RUN --mount=type=secret,id=SENTRY_AUTH_TOKEN     bash -c 'export SENTRY_AUTH_TOKEN="$(cat /run/secrets/SENTRY_AUTH_TOKEN)"; if [ -z $SENTRY_AUTH_TOKEN ];   0.0s
 => CACHED [api stage-5 2/5] COPY --from=build /app /app                                                                                                                               0.0s
 => CACHED [api go-base 2/3] COPY sharedLibs/go-html-to-md /app/sharedLibs/go-html-to-md                                                                                               0.0s
 => CACHED [api go-base 3/3] RUN cd /app/sharedLibs/go-html-to-md &&     go mod tidy &&     go build -o html-to-markdown.so -buildmode=c-shared html-to-markdown.go &&     chmod +x h  0.0s
 => CACHED [api stage-5 3/5] COPY --from=go-base /app/sharedLibs/go-html-to-md/html-to-markdown.so /app/sharedLibs/go-html-to-md/html-to-markdown.so                                   0.0s
 => CACHED [api rust-base 2/3] COPY sharedLibs/html-transformer /app/sharedLibs/html-transformer                                                                                       0.0s
 => CACHED [api rust-base 3/3] RUN cd /app/sharedLibs/html-transformer &&     cargo build --release &&     chmod +x target/release/libhtml_transformer.so                              0.0s
 => CACHED [api stage-5 4/5] COPY --from=rust-base /app/sharedLibs/html-transformer/target/release/libhtml_transformer.so /app/sharedLibs/html-transformer/target/release/libhtml_tra  0.0s
 => CACHED [api stage-5 5/5] RUN sed -i 's/\r$//' /app/docker-entrypoint.sh                                                                                                            0.0s
 => [api] exporting to image                                                                                                                                                           0.0s
 => => exporting layers                                                                                                                                                                0.0s
 => => writing image sha256:4df3365deda4dec2340b87f897233ca6f30072cd9bcae0cc46d6416c8a73b970                                                                                           0.0s
 => => naming to docker.io/library/firecrawl-api                                                                                                                                       0.0s
 => [api] resolving provenance for metadata file                                                                                                                                       0.0s
 => [worker internal] load build definition from Dockerfile                                                                                                                            0.0s
 => => transferring dockerfile: 2.17kB                                                                                                                                                 0.0s
 => [worker internal] load .dockerignore                                                                                                                                               0.0s
 => => transferring context: 77B                                                                                                                                                       0.0s
 => [worker internal] load build context                                                                                                                                               0.0s
 => => transferring context: 14.74kB                                                                                                                                                   0.0s
 => CACHED [worker base 4/5] COPY . /app                                                                                                                                               0.0s
 => CACHED [worker base 5/5] WORKDIR /app                                                                                                                                              0.0s
 => CACHED [worker prod-deps 1/1] RUN --mount=type=cache,id=pnpm,target=/pnpm/store pnpm install --prod --frozen-lockfile                                                              0.0s
 => CACHED [worker stage-5 1/5] COPY --from=prod-deps /app/node_modules /app/node_modules                                                                                              0.0s
 => CACHED [worker build 1/4] RUN --mount=type=cache,id=pnpm,target=/pnpm/store pnpm install --frozen-lockfile                                                                         0.0s
 => CACHED [worker build 2/4] RUN apt-get clean && apt-get update -qq && apt-get install -y ca-certificates && update-ca-certificates                                                  0.0s
 => CACHED [worker build 3/4] RUN pnpm install                                                                                                                                         0.0s
 => CACHED [worker build 4/4] RUN --mount=type=secret,id=SENTRY_AUTH_TOKEN     bash -c 'export SENTRY_AUTH_TOKEN="$(cat /run/secrets/SENTRY_AUTH_TOKEN)"; if [ -z $SENTRY_AUTH_TOKEN   0.0s
 => CACHED [worker stage-5 2/5] COPY --from=build /app /app                                                                                                                            0.0s
 => CACHED [worker go-base 2/3] COPY sharedLibs/go-html-to-md /app/sharedLibs/go-html-to-md                                                                                            0.0s
 => CACHED [worker go-base 3/3] RUN cd /app/sharedLibs/go-html-to-md &&     go mod tidy &&     go build -o html-to-markdown.so -buildmode=c-shared html-to-markdown.go &&     chmod +  0.0s
 => CACHED [worker stage-5 3/5] COPY --from=go-base /app/sharedLibs/go-html-to-md/html-to-markdown.so /app/sharedLibs/go-html-to-md/html-to-markdown.so                                0.0s
 => CACHED [worker rust-base 2/3] COPY sharedLibs/html-transformer /app/sharedLibs/html-transformer                                                                                    0.0s
 => CACHED [worker rust-base 3/3] RUN cd /app/sharedLibs/html-transformer &&     cargo build --release &&     chmod +x target/release/libhtml_transformer.so                           0.0s
 => CACHED [worker stage-5 4/5] COPY --from=rust-base /app/sharedLibs/html-transformer/target/release/libhtml_transformer.so /app/sharedLibs/html-transformer/target/release/libhtml_  0.0s
 => CACHED [worker stage-5 5/5] RUN sed -i 's/\r$//' /app/docker-entrypoint.sh                                                                                                         0.0s
 => [worker] exporting to image                                                                                                                                                        0.0s
 => => exporting layers                                                                                                                                                                0.0s
 => => writing image sha256:830a87784d98838b8d312ad0be37eb1e7d0e164846383ba51a6cabb0097082ea                                                                                           0.0s
 => => naming to docker.io/library/firecrawl-worker                                                                                                                                    0.0s
 => [worker] resolving provenance for metadata file                                                                                                                                    0.0s
time="2025-02-13T02:30:48+05:45" level=warning msg="The \"SERPER_API_KEY\" variable is not set. Defaulting to a blank string."
time="2025-02-13T02:30:48+05:45" level=warning msg="The \"OPENAI_BASE_URL\" variable is not set. Defaulting to a blank string."
time="2025-02-13T02:30:48+05:45" level=warning msg="The \"LOGTAIL_KEY\" variable is not set. Defaulting to a blank string."
time="2025-02-13T02:30:48+05:45" level=warning msg="The \"OPENAI_BASE_URL\" variable is not set. Defaulting to a blank string."
time="2025-02-13T02:30:48+05:45" level=warning msg="The \"LOGTAIL_KEY\" variable is not set. Defaulting to a blank string."
[+] Running 4/4
 ✔ Container firecrawl-redis-1               Created                                                                                                                                   0.0s
 ✔ Container firecrawl-playwright-service-1  Created                                                                                                                                   0.0s
 ✔ Container firecrawl-worker-1              Recreated                                                                                                                                 0.1s
 ✔ Container firecrawl-api-1                 Recreated                                                                                                                                 0.1s
Attaching to api-1, playwright-service-1, redis-1, worker-1
redis-1               | 1:C 12 Feb 2025 20:45:49.592 * oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
redis-1               | 1:C 12 Feb 2025 20:45:49.592 * Redis version=7.4.2, bits=64, commit=00000000, modified=0, pid=1, just started
redis-1               | 1:C 12 Feb 2025 20:45:49.592 * Configuration loaded
redis-1               | 1:M 12 Feb 2025 20:45:49.592 * monotonic clock: POSIX clock_gettime
redis-1               | 1:M 12 Feb 2025 20:45:49.593 * Running mode=standalone, port=6379.
redis-1               | 1:M 12 Feb 2025 20:45:49.593 * Server initialized
redis-1               | 1:M 12 Feb 2025 20:45:49.593 * Loading RDB produced by version 7.4.2
redis-1               | 1:M 12 Feb 2025 20:45:49.593 * RDB age 28 seconds
redis-1               | 1:M 12 Feb 2025 20:45:49.593 * RDB memory usage when created 1.51 Mb
redis-1               | 1:M 12 Feb 2025 20:45:49.593 * Done loading RDB, keys loaded: 3, keys expired: 0.
redis-1               | 1:M 12 Feb 2025 20:45:49.593 * DB loaded from disk: 0.000 seconds
redis-1               | 1:M 12 Feb 2025 20:45:49.593 * Ready to accept connections tcp
api-1                 | NEW ULIMIT: 65535
api-1                 | RUNNING app
worker-1              | NEW ULIMIT: 65535
worker-1              | RUNNING worker
api-1                 | 2025-02-12 20:45:50 warn [:]: Authentication is disabled. Supabase client will not be initialized. {}
worker-1              | 2025-02-12 20:45:50 warn [:]: Authentication is disabled. Supabase client will not be initialized. {}
api-1                 | 2025-02-12 20:45:51 warn [:]: POSTHOG_API_KEY is not provided - your events will not be logged. Using MockPostHog as a fallback. See posthog.ts for more. {}
worker-1              | 2025-02-12 20:45:51 warn [:]: POSTHOG_API_KEY is not provided - your events will not be logged. Using MockPostHog as a fallback. See posthog.ts for more. {}
api-1                 | 2025-02-12 20:45:51 info [:]: Number of CPUs: 16 available
api-1                 | 2025-02-12 20:45:51 info [:]: Web scraper queue created
api-1                 | 2025-02-12 20:45:51 info [:]: Extraction queue created
api-1                 | 2025-02-12 20:45:51 info [:]: Index queue created
api-1                 | 2025-02-12 20:45:51 info [:]: Worker 10 started
api-1                 | 2025-02-12 20:45:51 info [:]: Worker 10 listening on port 3002
api-1                 | 2025-02-12 20:45:51 info [:]: For the Queue UI, open: http://0.0.0.0:3002/admin/@/queues
api-1                 | 2025-02-12 20:45:51 info [:]: Connected to Redis Session Rate Limit Store!
playwright-service-1  | [2025-02-12 20:45:51 +0000] [9] [INFO] Running on http://[::]:3000 (CTRL + C to quit)
worker-1              | 2025-02-12 20:45:51 info [:]: Web scraper queue created
worker-1              | 2025-02-12 20:45:51 info [:]: Extraction queue created
worker-1              | 2025-02-12 20:45:51 info [:]: Connected to Redis Session Rate Limit Store!

I also have a question: if I set USE_DB_AUTHENTICATION=true, provide values for the three variables SUPABASE_ANON_TOKEN, SUPABASE_URL, and SUPABASE_SERVICE_TOKEN, and set TEST_API_KEY=test, then when calling the API, should I pass api_key="fc-test" or api_key="test"?

I tried both, but I got a not-authenticated error in Postman when calling the crawl API at http://localhost:3002/v1/crawl, even after providing accurate values for the required environment variables.

I have also shared screenshots of the API test where I get the unauthorized error.

Image

Image
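To compare the two key forms side by side, a small probe like the one below could record the HTTP status returned for each. This is only a diagnostic sketch and does not say which form the server expects. It assumes Node 18+ (global fetch) and the base URL from this report; candidateKeys and probeKeys are hypothetical names, and the request uses limit: 1 so that a successful call starts only a small crawl.

```javascript
// Sketch: try both the bare key and the "fc-" prefixed key and report
// the HTTP status for each. candidateKeys and probeKeys are hypothetical.

function candidateKeys(rawKey) {
  // Normalise to the bare form first, then derive the prefixed form.
  const bare = rawKey.startsWith("fc-") ? rawKey.slice(3) : rawKey;
  return [bare, `fc-${bare}`];
}

async function probeKeys(baseUrl, rawKey) {
  const results = [];
  for (const key of candidateKeys(rawKey)) {
    try {
      const res = await fetch(`${baseUrl}/v1/crawl`, {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Authorization: `Bearer ${key}`,
        },
        // Small limit so a successful request does not start a large crawl.
        body: JSON.stringify({ url: "https://docs.firecrawl.dev", limit: 1 }),
      });
      results.push({ key, status: res.status });
    } catch (err) {
      // Network-level failure rather than an HTTP error response.
      results.push({ key, status: null, error: String(err) });
    }
  }
  return results;
}
```

Running probeKeys("http://localhost:3002", "test").then(console.log) would list one status per key form, which makes it easy to attach the exact responses to this report alongside the Postman screenshots.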

samyogdhital, Feb 12 '25 21:02