403 not authenticated
i tried running the new docker image, and i get this on the crawl endpoint but health endpoint works fine
getting same error
@rafliekawangsha @elijahndege For the server, we set a standard API token - a token only you know - so that when you deploy the server to the cloud, your instance isn't abused by other users. The token name is CRAWL4AI_API_TOKEN. When you create the Docker image, pass it as an environment variable, and then use the same value in your client code. The documentation covers all of this, but I haven't pushed it yet; I'll push it today or tomorrow along with the new release 0.3.74, which includes a very detailed explanation of how to run the whole server.
i also get the same error
@sjhm131 @elijahndege The 403 error occurs because the crawl endpoint requires authentication. To avoid the 403 "Not authenticated" error, you need to provide a CRAWL4AI_API_TOKEN. This applies whether you're running the server locally, using Docker, or deploying via Docker Compose. Note that this behavior changed in the new version 0.4.1: if no API token is set in the environment, the server assumes it is running without authentication; if you do set the environment variable, the server enforces it. Either way, you won't face this error unexpectedly anymore. Below is an explanation of how to set the environment variable.
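For illustration, the 0.4.1 behavior described above can be sketched as a small standalone function. This is a hypothetical sketch of the rule, not the actual server code:

```python
from typing import Optional


def is_authorized(auth_header: Optional[str], configured_token: Optional[str]) -> bool:
    """Sketch of the 0.4.1 auth rule: no configured token means open access;
    a configured token must be presented as a Bearer header."""
    if configured_token is None:
        # CRAWL4AI_API_TOKEN not set: server runs without authentication.
        return True
    # Token set: the Authorization header must be "Bearer <token>".
    return auth_header == f"Bearer {configured_token}"
```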
Here are all three options:
Note: The token referenced in this example, 'CRAWL4AI_API_TOKEN', is simply a placeholder name for a token you generate for your own local server. It is not tied to any external service or registration process. When you run the Docker setup, you're building your own instance of the Crawl4AI server. The token is created locally to help manage access to your server, such as for team communication or integration in a secure way. This token is entirely self-managed and has no association with a paid or external API service. Perhaps I'll change the name to a more generic form to avoid confusion.
Running Locally (No Docker):
Set the environment variable in your local shell before starting the server:

export CRAWL4AI_API_TOKEN="your_secret_token"
# Then start the server, e.g.:
python main.py
Running With Docker:
Pass the token as an environment variable when running the container:

docker run -d \
  -p 11235:11235 \
  -e CRAWL4AI_API_TOKEN="your_secret_token" \
  your_image_name
Running With Docker Compose:
In your docker-compose.yml, ensure the token is set in the environment section. For example:

services:
  base-config:
    environment:
      - CRAWL4AI_API_TOKEN=your_secret_token
      - OPENAI_API_KEY=${OPENAI_API_KEY:-}
      - CLAUDE_API_KEY=${CLAUDE_API_KEY:-}
      # ... other configs ...

Once you have this set, just run:
docker-compose up
Making Requests:
When sending requests to protected endpoints (like /crawl_direct), include the token in the Authorization header:
curl -X POST "http://localhost:11235/crawl_direct" \
-H "Authorization: Bearer your_secret_token" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com"}'
This ensures the crawl endpoint will authenticate properly and you won’t receive a 403 error.
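If you prefer Python over curl, the same request can be built with the standard library. This is a minimal sketch using the same endpoint and payload as the curl example above:

```python
import json
import urllib.request

token = "your_secret_token"  # must match the CRAWL4AI_API_TOKEN you configured

req = urllib.request.Request(
    "http://localhost:11235/crawl_direct",
    data=json.dumps({"url": "https://example.com"}).encode(),
    headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# response = urllib.request.urlopen(req)  # requires the server to be running
```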
Sorry, one more question: how can I get the CRAWL4AI_API_TOKEN? Can I just set it to any value myself?
@sjhm131 My apologies for the confusion! There's no paid API involved; this is entirely open-source. When you run the Docker setup, you're creating your own Crawl4AI server, and the token is something you generate yourself to manage access to it. For example, if your team is using it within your company, they would need the token to communicate securely with the server. It's purely self-managed and has nothing to do with us. I'll update the documentation to make this clearer, and perhaps change the token name in the Dockerfile and codebase to avoid any further confusion.
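Any hard-to-guess string works as the token value. One simple way to generate one yourself (just a suggestion, using Python's standard library):

```python
import secrets

# Generate a random 64-character hex string to use as your token value.
token = secrets.token_hex(32)
print(token)
```

Set whatever this prints as CRAWL4AI_API_TOKEN on the server, and use the same value in your clients' Authorization headers.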