
[Bug]: GPU not found

blue-eyed-g opened this issue 10 months ago · 2 comments

Installation Method

Docker-Compose(Linux)

Version

Latest | 最新版

OS

Linux

Describe the bug

This is my docker-compose.yml:

version: '3'
services:
  gpt_academic_with_latex:
    image: ghcr.io/binary-husky/gpt_academic_with_latex:master
    environment:
      API_KEY: 'xxxx'
      USE_PROXY: 'False'
      proxies: '{"http": "xxxx", "https": "xxxx"}'
      API_URL_REDIRECT: '{"xxxxx": "xxxxx"}'
      LLM_MODEL: 'xxxx'
      AVAIL_LLM_MODELS: '["xxxxxx"]'
      GEMINI_API_KEY: 'xxxxxx'
      LOCAL_MODEL_DEVICE: 'cuda'
      DEFAULT_WORKER_NUM: '10'
      WEB_PORT: '12303'
    ports:
      - "12303:12303"
    command: >
      bash -c "python3 -u main.py"

My machine does in fact have a discrete GPU, and I can run ollama in Docker.

Screen Shot


Terminal Traceback & Material to Help Reproduce Bugs

WARNING:root:No GPU found. Conversion on CPU is very slow.
usage: nougat [-h] [--batchsize BATCHSIZE] [--checkpoint CHECKPOINT] [--model MODEL]
              [--out OUT] [--recompute] [--full-precision] [--no-markdown] [--markdown]
              [--no-skipping] [--pages PAGES]
              pdf [pdf ...]
nougat: error: the following arguments are required: pdf

blue-eyed-g · Apr 17 '24 17:04

I may have found the problem: the CUDA version inside the container differs from the host's. The host has 12.4, while the container has 12.1.


nvidia-cublas-cu12==12.4.5.8 is available (you have 12.1.3.1)
nvidia-cuda-cupti-cu12==12.4.127 is available (you have 12.1.105)
nvidia-cuda-nvrtc-cu12==12.4.127 is available (you have 12.1.105)
nvidia-cuda-runtime-cu12==12.4.127 is available (you have 12.1.105)
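As an aside (not part of the original report): CUDA follows minor-version compatibility within a major release, so libraries built against 12.1 generally run fine under a 12.4 host driver. The pip hints above are therefore unlikely to be the cause by themselves; a quick sketch of the check (`cuda_major` is a hypothetical helper):

```python
def cuda_major(version: str) -> int:
    """Extract the major component of a CUDA-style version string."""
    return int(version.split(".")[0])

# Versions taken from the pip messages above.
host_lib, container_lib = "12.4.5.8", "12.1.3.1"

# Same major version => driver-compatible; the mismatch alone does not
# explain a "No GPU found" warning.
compatible = cuda_major(host_lib) == cuda_major(container_lib)
print(compatible)  # → True
```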


blue-eyed-g · Apr 17 '24 17:04

Your docker-compose is probably missing the NVIDIA runtime parameters.

This is my docker-compose.yml:

version: '3'
services:
  gpt_academic_full_capability:
    image: ghcr.io/binary-husky/gpt_academic_with_all_capacity:master
    environment:
      # See `config.py` or the GitHub wiki for all configuration options
      API_KEY:                  '  sk-114514                        '
      LLM_MODEL:                '  gpt-3.5-turbo                                                              '
      AVAIL_LLM_MODELS:         '  ["gpt-3.5-turbo", "gpt-4-turbo-preview","claude-3-sonnet-20240229","claude-3-opus-20240229", "glm-4"]       '
      DEFAULT_WORKER_NUM:       '  10                                                                         '
      WEB_PORT:                 '  1919                                                                     '
      THEME:                    '  Default                                                '
      LOCAL_MODEL_DEVICE:       '  cuda                                                                       '
    deploy:
        resources:
          reservations:
            devices:
              - driver: nvidia
                count: 1
                capabilities: [gpu]

    # [WEB_PORT exposure, method 1 (Linux only)]: share the host's network
    #network_mode: "host"

    # [WEB_PORT exposure, method 2 (all systems)]: port mapping
    ports:
     - "1919:1919"  # must match WEB_PORT

    # After the container starts, run the main program main.py
    command: >
      bash -c "python3 -u main.py"
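If the installed docker-compose is too old to honor the `deploy` key, the older per-service `runtime` field is an alternative route to the same result. This is a sketch only, and it assumes the NVIDIA Container Toolkit is installed and registered as a Docker runtime on the host:

```yaml
services:
  gpt_academic_full_capability:
    # Legacy alternative to the `deploy.resources.reservations` block above
    runtime: nvidia
    environment:
      NVIDIA_VISIBLE_DEVICES: all
```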

My host is also on CUDA 12.4, but when I start the container I can still see CUDA running normally inside it.

$ pacman -Qs cuda
local/cuda 12.4.1-1
    NVIDIA's GPU programming toolkit
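The fix above boils down to the `deploy.resources.reservations.devices` entry: without it, Compose never asks Docker to attach a GPU, regardless of the CUDA versions involved. A minimal illustration of the structure Compose expects (`has_gpu_reservation` is a hypothetical stdlib-only helper, not part of the project):

```python
# The GPU reservation from the compose file above, as the nested
# mapping that `docker compose config` would produce.
gpu_service = {
    "image": "ghcr.io/binary-husky/gpt_academic_with_all_capacity:master",
    "deploy": {
        "resources": {
            "reservations": {
                "devices": [
                    {"driver": "nvidia", "count": 1, "capabilities": ["gpu"]}
                ]
            }
        }
    },
}

def has_gpu_reservation(service: dict) -> bool:
    """True if the service requests an NVIDIA GPU via a device reservation."""
    devices = (service.get("deploy", {})
                      .get("resources", {})
                      .get("reservations", {})
                      .get("devices", []))
    return any(d.get("driver") == "nvidia" and "gpu" in d.get("capabilities", [])
               for d in devices)

print(has_gpu_reservation(gpu_service))  # → True
```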


Menghuan1918 · Apr 19 '24 05:04


OK, thanks, that solved it.

blue-eyed-g · Apr 22 '24 07:04