Zheng.Deng
I have the same problem as heekhero. Does anybody know how to handle it?
I've solved this problem (though I hit another one after that). The cause is in train.py: `import resource`, then `rlimit = resource.getrlimit(resource.RLIMIT_NOFILE)` and `resource.setrlimit(resource.RLIMIT_NOFILE, (20480, rlimit[1]))` trigger the problem. You should add a `print(rlimit)` after the `rlimit` ...
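For reference, the snippet the comment describes can be sketched as follows. This is a minimal, hedged version, not the project's exact train.py: it prints the current limits as suggested, and additionally clamps the requested soft limit to the hard limit so `setrlimit` cannot raise `ValueError` on systems where the hard limit is below 20480 (that clamp is my addition, not in the original comment).

```python
import resource  # Unix-only module for process resource limits

# getrlimit returns (soft, hard); the soft limit is what the process
# actually hits, and an unprivileged process can only raise soft up to hard.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print((soft, hard))  # the print the comment recommends adding

# 20480 is the value from the original comment; clamp it so setrlimit
# cannot fail when the hard limit is lower (RLIM_INFINITY means no cap).
target = 20480 if hard == resource.RLIM_INFINITY else min(20480, hard)
resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))

new_soft, _ = resource.getrlimit(resource.RLIMIT_NOFILE)
print(new_soft)
```

Printing the limits first shows whether the "too many open files" class of error comes from a low soft limit or from a hard limit that cannot be raised without privileges.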
It was configured in the Anaconda virtual environment.
I haven't used this project's model, but you can use optimum-cli to convert the .pth to ONNX. You can try https://huggingface.co/nenkoru/llama-7b-onnx-merged-fp16
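A sketch of the kind of optimum-cli invocation the comment refers to (a command fragment, not runnable without the model): the paths are placeholders, and `optimum-cli export onnx` expects a Hugging Face model directory or hub ID, so a raw .pth checkpoint generally has to be loaded into a `transformers` model and saved with `save_pretrained` first.

```shell
# Assumes: pip install "optimum[exporters]"
# <model-dir-or-hub-id> and llama-7b-onnx/ are placeholders.
optimum-cli export onnx --model <model-dir-or-hub-id> llama-7b-onnx/
```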
I tried to run an int8 model with W8A8 via api_server.py; when I call api_client.py I get this error: File "/vllm/vllm/model_executor/layers/attention.py", line 255, in forward `cache_ops.reshape_and_cache(` RuntimeError: expected scalar type Long but found Int...
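For context on this class of error: it usually means an index tensor was created as Int (int32) but passed to an op that requires Long (int64). A minimal illustration of the cast that resolves it, with a hypothetical tensor name standing in for whatever index tensor vLLM passes here (this is not vLLM's actual code):

```python
import torch

# Hypothetical stand-in for an index tensor built as int32; many
# indexing/cache ops require int64 ("Long") indices.
slot_mapping = torch.arange(16, dtype=torch.int32)
print(slot_mapping.dtype)  # torch.int32

# Casting to torch.long (int64) avoids
# "RuntimeError: expected scalar type Long but found Int".
slot_mapping = slot_mapping.to(torch.long)
print(slot_mapping.dtype)  # torch.int64
```

In practice the fix belongs wherever the quantized path constructs the tensor, so it reaches `reshape_and_cache` already as int64.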
> Hi @dengzheng-cloud, > > The Gemma 2B model does not need to be converted into the LLM Inference format if downloaded from Kaggle. This is documented [here](https://developers.google.com/mediapipe/solutions/genai/llm_inference/android#convert-model). If you are not using...