SegFault when exporting Llama with --optimize O4
System Info
- `optimum` version: 1.14.0
- `transformers` version: 4.35.0
- Platform: Linux-5.15.0-88-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.16.4
- PyTorch version (GPU?): 2.0.1+cu117 (cuda availabe: True)
- Tensorflow version (GPU?): not installed (cuda availabe: NA)
- onnxruntime-gpu: 1.16.1
- onnxruntime: 1.16.1
Who can help?
@michaelbenayoun
Information
- [X] The official example scripts
- [ ] My own modified scripts
Tasks
- [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction (minimal, reproducible, runnable)
When I run:
optimum-cli export onnx --model PY007/TinyLlama-1.1B-Chat-v0.3 --task text-generation-with-past --fp16 --optimize O4 --for-ort --device cuda TinyLlama-1.1B-Chat-v0.3-onnx
the model optimization step fails with a segmentation fault:
Optimizing model...
2023-11-13 11:25:38.077517844 [W:onnxruntime:, session_state.cc:1162 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2023-11-13 11:25:38.077534334 [W:onnxruntime:, session_state.cc:1164 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
symbolic shape inference disabled or failed.
Segmentation fault (core dumped)
However, without --optimize the export completes successfully.
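For reference, here is a roughly equivalent programmatic reproduction using the optimum Python API. This is a minimal sketch based on my assumption that `ORTModelForCausalLM`, `ORTOptimizer`, and `AutoOptimizationConfig.O4()` mirror what the CLI flags do; it is not an exact trace of optimum-cli:

```python
from optimum.onnxruntime import ORTModelForCausalLM, ORTOptimizer
from optimum.onnxruntime.configuration import AutoOptimizationConfig

model_id = "PY007/TinyLlama-1.1B-Chat-v0.3"
save_dir = "TinyLlama-1.1B-Chat-v0.3-onnx"

# Export the PyTorch checkpoint to ONNX with past key/values on CUDA.
ort_model = ORTModelForCausalLM.from_pretrained(
    model_id,
    export=True,
    use_cache=True,
    provider="CUDAExecutionProvider",
)
ort_model.save_pretrained(save_dir)

# Apply O4 graph optimizations (fp16, GPU-only), intended to mirror --optimize O4.
optimizer = ORTOptimizer.from_pretrained(ort_model)
optimization_config = AutoOptimizationConfig.O4()
optimizer.optimize(save_dir=save_dir, optimization_config=optimization_config)
```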
Expected behavior
The model should be exported and optimized successfully.
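Once the optimized export works, I would expect to be able to load and run it along these lines (a sketch, assuming the output directory from the command above and that the tokenizer files are saved alongside the ONNX model):

```python
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForCausalLM

save_dir = "TinyLlama-1.1B-Chat-v0.3-onnx"

tokenizer = AutoTokenizer.from_pretrained(save_dir)
model = ORTModelForCausalLM.from_pretrained(save_dir, provider="CUDAExecutionProvider")

# Simple generation smoke test on GPU.
inputs = tokenizer("Hello, my name is", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```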