Zheng.Deng
I have the same problem as heekhero. Does anybody know how to handle it?
I've solved this problem (though I hit another one after that). The cause is in train.py: `import resource`, then `rlimit = resource.getrlimit(resource.RLIMIT_NOFILE)` and `resource.setrlimit(resource.RLIMIT_NOFILE, (20480, rlimit[1]))` trigger the problem. You should add a `print(rlimit)` after the `rlimit` ...
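For reference, the snippet the comment describes can be sketched as follows. This is a minimal, hedged version, not the project's exact train.py: it prints the current limits as suggested, and additionally clamps the requested soft limit to the hard limit so `setrlimit` cannot raise `ValueError` on systems where the hard limit is below 20480 (that clamp is my addition, not in the original comment).

```python
import resource  # Unix-only module for process resource limits

# getrlimit returns (soft, hard); the soft limit is what the process
# actually hits, and an unprivileged process can only raise soft up to hard.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print((soft, hard))  # the print the comment recommends adding

# 20480 is the value from the original comment; clamp it so setrlimit
# cannot fail when the hard limit is lower (RLIM_INFINITY means no cap).
target = 20480 if hard == resource.RLIM_INFINITY else min(20480, hard)
resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))

new_soft, _ = resource.getrlimit(resource.RLIMIT_NOFILE)
print(new_soft)
```

Printing the limits first shows whether the "too many open files" class of error comes from a low soft limit or from a hard limit that cannot be raised without privileges.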
It was configured in the Anaconda virtual environment.
I haven't used this project's model, but you can use optimum-cli to convert the .pth to ONNX. You can try https://huggingface.co/nenkoru/llama-7b-onnx-merged-fp16
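A sketch of the kind of optimum-cli invocation the comment refers to (a command fragment, not runnable without the model): the paths are placeholders, and `optimum-cli export onnx` expects a Hugging Face model directory or hub ID, so a raw .pth checkpoint generally has to be loaded into a `transformers` model and saved with `save_pretrained` first.

```shell
# Assumes: pip install "optimum[exporters]"
# <model-dir-or-hub-id> and llama-7b-onnx/ are placeholders.
optimum-cli export onnx --model <model-dir-or-hub-id> llama-7b-onnx/
```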
I tried to run an int8 model with W8A8 via api_server.py; when I call api_client.py I get this error: File "/vllm/vllm/model_executor/layers/attention.py", line 255, in forward `cache_ops.reshape_and_cache(` RuntimeError: expected scalar type Long but found Int...
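For context on this class of error: it usually means an index tensor was created as Int (int32) but passed to an op that requires Long (int64). A minimal illustration of the cast that resolves it, with a hypothetical tensor name standing in for whatever index tensor vLLM passes here (this is not vLLM's actual code):

```python
import torch

# Hypothetical stand-in for an index tensor built as int32; many
# indexing/cache ops require int64 ("Long") indices.
slot_mapping = torch.arange(16, dtype=torch.int32)
print(slot_mapping.dtype)  # torch.int32

# Casting to torch.long (int64) avoids
# "RuntimeError: expected scalar type Long but found Int".
slot_mapping = slot_mapping.to(torch.long)
print(slot_mapping.dtype)  # torch.int64
```

In practice the fix belongs wherever the quantized path constructs the tensor, so it reaches `reshape_and_cache` already as int64.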
> Hi @dengzheng-cloud, > > The Gemma 2B model does not need to be converted into the LLM Inference format if downloaded from Kaggle. This is documented [here](https://developers.google.com/mediapipe/solutions/genai/llm_inference/android#convert-model). If you are not using...