nuaabuaa07
Could someone explain what "modify the email-related settings in docker-compose.yml" means? I don't quite understand it. Which settings exactly need to be added, and is the email here an ordinary mailbox like QQ Mail? Could you provide an example?
This is my config in .env, but I still get an error. This is the error message: ERROR [pilot.scene.base_chat] model response parase faild!Model server error!code=1, errmsg is **LLMServer Generate Error, Please CheckErrorInfo.**: Error...
I hit a similar error. After pulling the latest code again today, the error is indeed gone and everything works now.
> I surprisingly found that after I updated the ollama version, the sha256-related problem disappeared. I will work on the performance of this quantized model recently. Thanks for the community's...
Does that mean the inference service can only be deployed on a single-GPU machine?
With a single card I run out of memory: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 50.00 MiB (GPU 0; 22.20 GiB total capacity; 21.53 GiB already allocated; 48.12 MiB free; 21.55 GiB reserved in total by...
My error is similar to yours; the difference is that it complains that running on two GPUs is not allowed. I haven't found a solution yet.
I have dual A10 cards and also get the error that multiple GPUs are not supported. Could you explain in detail how to use multiple GPUs?
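
Not a confirmed fix for this project, but with plain transformers/accelerate you can usually shard a model across two cards via `device_map="auto"`. A minimal sketch; the checkpoint path and the per-GPU memory caps below are assumptions (sized for 22 GiB A10s), not values from this thread:

```python
# Sketch: shard a LLaMA-style model across two GPUs with transformers + accelerate.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

ziya_model_path = "/path/to/ziya-llama-13b"  # hypothetical path, adjust to yours

model = LlamaForCausalLM.from_pretrained(
    ziya_model_path,
    torch_dtype=torch.float16,            # halve the footprint vs fp32
    device_map="auto",                    # let accelerate place layers on GPU 0/1
    max_memory={0: "20GiB", 1: "20GiB"},  # leave headroom on each 22 GiB card
)
tokenizer = LlamaTokenizer.from_pretrained(ziya_model_path)
```

If loading still fails, combining this with the quantized loading discussed below reduces memory further.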
Is this the right configuration for loading the model with 8-bit quantization?

```python
from transformers import LlamaForCausalLM

model = LlamaForCausalLM.from_pretrained(
    ziya_model_path,
    # torch_dtype=torch.float16,
    load_in_8bit=True,
    device_map="auto",
)
```
> Is this the right configuration for loading the model with 8-bit quantization?

Passing `load_in_8bit=True` directly raises an error; you need to go through a quantization config instead, like this:

```python
from transformers import BitsAndBytesConfig, LlamaForCausalLM

nf4_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4")
model = LlamaForCausalLM.from_pretrained(
    ziya_model_path,
    quantization_config=nf4_config,
    device_map='auto',
)
```
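
Note that the snippet above configures 4-bit NF4 quantization, not 8-bit. If you specifically want 8-bit, the same `BitsAndBytesConfig` route should work; a hedged sketch, assuming a recent transformers with bitsandbytes installed:

```python
# Sketch of the 8-bit equivalent via BitsAndBytesConfig (assumes recent
# transformers + bitsandbytes); ziya_model_path is the path from this thread.
from transformers import BitsAndBytesConfig, LlamaForCausalLM

int8_config = BitsAndBytesConfig(load_in_8bit=True)
model = LlamaForCausalLM.from_pretrained(
    ziya_model_path,
    quantization_config=int8_config,  # preferred over the bare load_in_8bit flag
    device_map="auto",
)
```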