
[Bug]: ImportError: cannot import name 'CrawlerRunConfig' from 'crawl4ai' (/app/crawl4ai/__init__.py)

Open mirozbiro opened this issue 10 months ago • 2 comments

crawl4ai version

crawl4ai-0.4.248

Expected Behavior

To be able to import CrawlerRunConfig as per examples in https://docs.crawl4ai.com/extraction/no-llm-strategies/
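For reference, the documented import can be smoke-tested in isolation (a sketch; the try/except guard is my addition so the check reports the failure instead of crashing):

```python
# Smoke-test the import shown in
# https://docs.crawl4ai.com/extraction/no-llm-strategies/
try:
    from crawl4ai import AsyncWebCrawler, CrawlerRunConfig  # noqa: F401
    IMPORT_OK = True
except ImportError as exc:
    IMPORT_OK = False
    print(f"import failed: {exc}")

print("CrawlerRunConfig importable:", IMPORT_OK)
```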

Current Behavior

When the script initializes and tries to import CrawlerRunConfig, the import fails.

Is this reproducible?

Yes

Inputs Causing the Bug

The issue happens during startup, when the import runs.

Steps to Reproduce

To reproduce, run the given Dockerfile with these two lines commented out:
RUN pip install -U crawl4ai
RUN crawl4ai-doctor

I added the above two lines to try to upgrade to the latest version, since I was unable to import the class "JsonXPathExtractionStrategy" and found "JsonXPATHExtractionStrategy" instead.
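A quick way to see which spelling the installed version actually exports is to probe the module's attributes (a generic sketch; the crawl4ai call is shown only as a comment, since its result depends on the installed version):

```python
import importlib

def exported_names(module_name: str, fragment: str) -> list[str]:
    """List attributes of a module whose names contain `fragment` (case-insensitive)."""
    mod = importlib.import_module(module_name)
    return sorted(n for n in dir(mod) if fragment.lower() in n.lower())

# e.g. exported_names("crawl4ai", "xpath") would show whether the installed
# version exports JsonXPathExtractionStrategy, JsonXPATHExtractionStrategy, or neither.
print(exported_names("json", "decode"))
```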

When the two lines are added, pip prints this during the build:

Attempting uninstall: crawl4ai
    Found existing installation: Crawl4AI 0.3.745
    Uninstalling Crawl4AI-0.3.745:
      Successfully uninstalled Crawl4AI-0.3.745
Successfully installed cffi-1.17.1 crawl4ai-0.4.248 

This confirmed that the Docker image "unclecode/crawl4ai:all-amd64" ships an older version, which is what prompted the upgrade attempt. However, even after upgrading, the import still fails.
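Note the path in the error, /app/crawl4ai/__init__.py: pip normally installs into site-packages, so a local /app/crawl4ai directory (brought in by `COPY . .` into WORKDIR /app) may be shadowing the upgraded package. A generic way to check which copy Python actually imports (a sketch; the crawl4ai line is commented out because it depends on the environment):

```python
import importlib

def module_origin(name: str) -> str:
    """Return the file path a module is actually imported from."""
    mod = importlib.import_module(name)
    return getattr(mod, "__file__", "<built-in>")

# If this prints a path under /app/crawl4ai rather than site-packages,
# a local directory is shadowing the pip-installed package:
# print(module_origin("crawl4ai"))
print(module_origin("json"))
```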

Code snippets

###################################################################################
# Dockerfile
FROM unclecode/crawl4ai:all-amd64

WORKDIR /app

# Install required packages  
RUN apt-get update && apt-get install -y python3 python3-pip \
     python3-venv git xvfb libpq-dev gcc \
     fluxbox x11vnc && apt-get clean && rm -rf /var/lib/apt/lists/*

# Set the DISPLAY variable to use the virtual display.
ENV DISPLAY=:99

# Create the session directory
RUN mkdir -p /app/session

# Copy the entrypoint script into the container.
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh

# Expose the VNC & API port so you can connect from outside the container.
EXPOSE 5900 
EXPOSE 8080 

# Copy the rest of the project files
COPY . .

# Install Python packages
RUN pip install --upgrade pip && pip install --no-cache-dir -r requirements.txt

# added as part of testing, since the base docker image seems to be a little behind
RUN pip install -U crawl4ai
RUN crawl4ai-doctor

# Start Xvfb, a simple window manager (fluxbox), and x11vnc.
# Then, run your application (for example, "python /app/crawler.py").
CMD ["/entrypoint.sh"]

#################### Dockerfile end ###############################################


######################## docker-compose.yaml #############################
crawl4ai:
    build: 
      context: ./crawler-app
      dockerfile: Dockerfile
    env_file:
      - .env
    environment:
      - POSTGRES_DB_MAIN=${POSTGRES_DB_MAIN}
      - POSTGRES_CRAWL4AI_USER=${POSTGRES_CRAWL4AI_USER}
      - POSTGRES_CRAWL4AI_PASSWORD=${POSTGRES_CRAWL4AI_PASSWORD}
      - POSTGRES_HOST=${POSTGRES_HOST}
      - POSTGRES_PORT=${POSTGRES_PORT}
      # LLM Provider Keys
      #- OPENAI_API_KEY=${OPENAI_API_KEY:-}
      #- ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-}
    #restart: "unless-stopped"
    depends_on:
      - postgres-db
    ports:
    #  - "11235:11235"
    #  - "5900:5900"
      - "8080:8080"
    volumes:
      - /dev/shm:/dev/shm
    deploy:
      resources:
        limits:
          memory: 4G
        reservations:
          memory: 1G

OS

Ubuntu 22.04 LTS running Docker

Python version

python3

Browser

NA

Browser version

NA

Error logs & Screenshots (if applicable)

Traceback (most recent call last):
  File "/app/crawler.py", line 4, in <module>
    from crawl4ai import AsyncWebCrawler, CrawlerRunConfig
ImportError: cannot import name 'CrawlerRunConfig' from 'crawl4ai' (/app/crawl4ai/__init__.py)
Traceback (most recent call last):
  File "/usr/local/bin/uvicorn", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/uvicorn/main.py", line 412, in main
    run(
  File "/usr/local/lib/python3.10/site-packages/uvicorn/main.py", line 579, in run
    server.run()
  File "/usr/local/lib/python3.10/site-packages/uvicorn/server.py", line 65, in run
    return asyncio.run(self.serve(sockets=sockets))
  File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.10/site-packages/uvicorn/server.py", line 69, in serve
    await self._serve(sockets)
  File "/usr/local/lib/python3.10/site-packages/uvicorn/server.py", line 76, in _serve
    config.load()
  File "/usr/local/lib/python3.10/site-packages/uvicorn/config.py", line 434, in load
    self.loaded_app = import_from_string(self.app)
  File "/usr/local/lib/python3.10/site-packages/uvicorn/importer.py", line 19, in import_from_string
    module = importlib.import_module(module_str)
  File "/usr/local/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/app/crawler.py", line 4, in <module>
    from crawl4ai import AsyncWebCrawler, CrawlerRunConfig
ImportError: cannot import name 'CrawlerRunConfig' from 'crawl4ai' (/app/crawl4ai/__init__.py)
Crawler script finished. Sleeping for 60 seconds before restarting...
Traceback (most recent call last):
  File "/app/crawler.py", line 4, in <module>
    from crawl4ai import AsyncWebCrawler, CrawlerRunConfig
ImportError: cannot import name 'CrawlerRunConfig' from 'crawl4ai' (/app/crawl4ai/__init__.py)
Crawler script finished. Sleeping for 60 seconds before restarting...

mirozbiro avatar Feb 15 '25 05:02 mirozbiro

As I found, there is a bunch of other stuff missing. It seems closely related to the version used by the image on the public repo. Per your docs, Docker is the recommended setup, but the problem is that the published image is old and missing many features the public docs describe.

Switching to a different (plain python) Docker image and then running:

RUN pip install -U crawl4ai
RUN playwright install
RUN playwright install-deps

removes the problem.
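The workaround could look roughly like this (a sketch, not an official image; the base tag, port, and entrypoint are assumptions carried over from the original Dockerfile):

```dockerfile
# Sketch: plain Python base image instead of unclecode/crawl4ai:all-amd64
FROM python:3.10-slim

WORKDIR /app

# Latest crawl4ai plus Playwright browsers and their system dependencies
RUN pip install -U crawl4ai
RUN playwright install
RUN playwright install-deps

COPY . .

EXPOSE 8080
CMD ["python", "crawler.py"]
```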

mirozbiro avatar Feb 16 '25 06:02 mirozbiro

Hey, I am getting this error while even using crawl4ai==0.5.0, any Idea?

from crawl4ai import AsyncWebCrawler, CrawlerRunConfig, CacheMode, LLMConfig
ImportError: cannot import name 'LLMConfig' from 'crawl4ai' (C:\Python\Lib\site-packages\crawl4ai\__init__.py)

piyushptiwari1 avatar Mar 25 '25 17:03 piyushptiwari1

@mirozbiro @piyushptiwari1

Hi, we’ve already updated the Docker image. I just ran a simple no-LLM strategy example, and it worked fine.

Could you please try using the latest Docker version and see if the issue still occurs? Also, if you can share the code you tested with, that would be really helpful for us to debug further.

ntohidi avatar May 02 '25 13:05 ntohidi

Checking over the next few days.

mirozbiro avatar May 06 '25 10:05 mirozbiro

Hi @mirozbiro. I've tried this just now with our latest release 0.6.0 and I don't see this issue. Please try to pull the latest docker image from dockerhub and try again.

Reopen this issue if the problem still exists.

aravindkarnam avatar May 08 '25 05:05 aravindkarnam

Sounds good. Thanks.

mirozbiro avatar May 09 '25 11:05 mirozbiro