WSL2 Docker: LLM service times out

Open ye-jeck opened this issue 1 year ago • 1 comments

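After running the command "bash ./run.sh -c local -i 0 -b hf -m MiniChat-2-3B -t minich", the service keeps looping through startup and eventually times out. I suspect the LLM hit an exception while loading, but I could not find the corresponding log file. Could someone take a look and suggest how to solve this?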

qanything-container-local | Embedding 和 Rerank 服务已准备就绪！(7.5/8)
qanything-container-local | 2024-05-14 15:38:49 | INFO | model_worker | args: Namespace(host='0.0.0.0', port=7801, worker_address='http://0.0.0.0:7801', controller_address='http://0.0.0.0:7800', model_path='/model_repos/CustomLLM/MiniChat-2-3B', revision='main', device='cuda', gpus='0', num_gpus=1, max_gpu_memory=None, dtype='bfloat16', load_8bit=True, cpu_offloading=False, gptq_ckpt=None, gptq_wbits=16, gptq_groupsize=-1, gptq_act_order=False, awq_ckpt=None, awq_wbits=16, awq_groupsize=-1, enable_exllama=False, exllama_max_seq_len=4096, exllama_gpu_split=None, exllama_cache_8bit=False, enable_xft=False, xft_max_seq_len=4096, xft_dtype=None, model_names=None, conv_template='minichat', embed_in_truncate=False, limit_worker_concurrency=5, stream_interval=2, no_register=False, seed=None, debug=False, ssl=False)
qanything-container-local | 2024-05-14 15:38:49 | INFO | model_worker | Loading the model ['MiniChat-2-3B'] on worker 3d1353e2 ...
qanything-container-local | You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This is expected, and simply means that the legacy (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set legacy=False. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
0%| | 0/1 [00:00<?, ?it/s]14 15:38:50 | ERROR | stderr |
qanything-container-local | % Total % Received % Xferd Average Speed Time Time Time Current
qanything-container-local | Dload Upload Total Spent Left Speed
100 13 100 13 0 0 92 0 --:--:-- --:--:-- --:--:-- 92
qanything-container-local | The llm service is starting up, it can be long... you have time to make a coffee :)
qanything-container-local | LLM 服务正在启动，可能需要一段时间...你有时间去冲杯咖啡 :)
qanything-container-local | % Total % Received % Xferd Average Speed Time Time Time Current
qanything-container-local | Dload Upload Total Spent Left Speed
100 13 100 13 0 0 7030 0 --:--:-- --:--:-- --:--:-- 13000
qanything-container-local | The llm service is starting up, it can be long... you have time to make a coffee :)
qanything-container-local | LLM 服务正在启动，可能需要一段时间...你有时间去冲杯咖啡 :)
qanything-container-local | % Total % Received % Xferd Average Speed Time Time Time Current
qanything-container-local | Dload Upload Total Spent Left Speed
100 13 100 13 0 0 12896 0 --:--:-- --:--:-- --:--:-- 13000
qanything-container-local | The llm service is starting up, it can be long... you have time to make a coffee :)
qanything-container-local | LLM 服务正在启动，可能需要一段时间...你有时间去冲杯咖啡 :)
qanything-container-local | % Total % Received % Xferd Average Speed Time Time Time Current
qanything-container-local | Dload Upload Total Spent Left Speed
100 13 100 13 0 0 12380 0 --:--:-- --:--:-- --:--:-- 13000
qanything-container-local | The llm service is starting up, it can be long... you have time to make a coffee :)
qanything-container-local | LLM 服务正在启动，可能需要一段时间...你有时间去冲杯咖啡 :)
qanything-container-local | % Total % Received % Xferd Average Speed Time Time Time Current
qanything-container-local | Dload Upload Total Spent Left Speed
100 13 100 13 0 0 13713 0 --:--:-- --:--:-- --:--:-- 13000
qanything-container-local | The llm service is starting up, it can be long... you have time to make a coffee :)
qanything-container-local | LLM 服务正在启动，可能需要一段时间...你有时间去冲杯咖啡 :)
qanything-container-local | % Total % Received % Xferd Average Speed Time Time Time Current
qanything-container-local | Dload Upload Total Spent Left Speed
100 13 100 13 0 0 7484 0 --:--:-- --:--:-- --:--:-- 13000
qanything-container-local | The llm service is starting up, it can be long... you have time to make a coffee :)
qanything-container-local | LLM 服务正在启动，可能需要一段时间...你有时间去冲杯咖啡 :)

May 14 '24 07:05 ye-jeck

The model is too large. Switch to a smaller model, or use a GPU with more VRAM.
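To sanity-check whether model size is the likely culprit, a back-of-the-envelope estimate of the memory needed for the weights alone can help. The sketch below is only an illustration, not part of QAnything: the helper names, the nominal 3e9 parameter count (taken from the "3B" in the model name), and the 1.2x overhead factor are all assumptions. Real usage also depends on the KV cache, activations, and the other services sharing the GPU.

```python
# Rough estimate of GPU memory needed for model weights only.
# Assumptions (not from the issue): parameter count and overhead factor.

BYTES_PER_PARAM = {
    "float32": 4.0,
    "bfloat16": 2.0,
    "float16": 2.0,
    "int8": 1.0,  # roughly what 8-bit loading stores per weight
}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the approximate size of the weights in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


def fits(num_params: float, dtype: str, free_gib: float, overhead: float = 1.2) -> bool:
    """Check whether the weights, padded by a hypothetical overhead factor,
    fit into the given amount of free GPU memory."""
    return weight_memory_gib(num_params, dtype) * overhead <= free_gib


if __name__ == "__main__":
    params = 3e9  # nominal size for a "3B" model (assumption)
    for dtype in ("bfloat16", "int8"):
        print(f"{dtype}: {weight_memory_gib(params, dtype):.2f} GiB")
    print("fits in 4 GiB free (int8):", fits(params, "int8", 4.0))
```

Compare the result with the free memory that `nvidia-smi` reports inside WSL2 while the Embedding and Rerank services are already running, since they occupy part of the same GPU.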

Jul 09 '24 09:07 LossHu