Dominic23331
In some scenarios, the image encoder may need to be exported to ONNX. I exported it and deployed it with onnxruntime, which gave a significant speedup. Can I...
This allows exporting the image encoder to ONNX. I created a new file to perform the export.
### Describe the feature

I suggest supporting GhostNet: https://arxiv.org/abs/1911.11907

### Will you implement it?

- [X] I would like to implement this feature and create a PR!
### What is the issue?

I use a Jetson Nano to run ollama, but when I check jtop, I find that my GPU is not being used; ollama is running on the CPU....
During training, the program stops after printing the following output. How can I fix this? `2024-05-15 09:29:44.939294: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered 2024-05-15 09:29:44.939347: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register...
I am running chatglm2 on a Raspberry Pi 3B+ but cannot open the model. How can I fix this? `main: seed = 1715851844 Assert ' m_file ' failed at file : /home/dominic/project/InferLLM-main/src/file.cpp line 10 : inferllm::InputFile::InputFile(const std::string &, bool), extra message: Failed to open model file.Aborted `
This is the flash-attention code that I have encapsulated:

```python
# flash-attention
import math

import torch
import torch.nn as nn
from torch.nn.init import (
    xavier_uniform_,
    constant_,
    xavier_normal_
)
from...
```
I want to use a custom attention mask. Does flash attention support it? What should I do?