optimum
Error onnx_proto.TensorProto.FLOAT8E4M3FN: float8e4m3fn
System Info
platform: Linux Ubuntu Server 20.04 x64
Python 3.8.10 (default, May 26 2023, 14:05:08)
[GCC 9.4.0] on linux
python packages:
Package Version Editable project location
------------------------- ------------------ --------------------------------------
absl-py 1.4.0
accelerate 0.24.0
aiohttp 3.8.4
aiosignal 1.3.1
alembic 1.10.2
antlr4-python3-runtime 4.9.3
anyio 4.0.0
APScheduler 3.10.1
arrow 1.2.3
async-timeout 4.0.2
attrs 22.2.0
audioread 3.0.0
av 9.2.0
Babel 2.12.1
backoff 1.11.1
backports.zoneinfo 0.2.1
beautifulsoup4 4.11.2
binaryornot 0.4.4
black 23.1.0
cachetools 5.3.0
certifi 2022.12.7
cffi 1.15.1
chardet 5.1.0
charset-normalizer 3.1.0
click 8.1.3
clldutils 3.19.0
cloudpickle 2.2.1
cmaes 0.9.1
cmake 3.26.3
codecarbon 1.2.0
colorama 0.4.6
coloredlogs 15.0.1
colorlog 6.7.0
contourpy 1.0.7
cookiecutter 1.7.3
csvw 3.1.3
cycler 0.11.0
dash 2.8.1
dash-bootstrap-components 1.4.0
dash-core-components 2.0.0
dash-html-components 2.0.0
dash-table 5.0.0
datasets 2.10.1
decorator 5.1.1
decord 0.6.0
Deprecated 1.2.14
detectron2 0.6
dill 0.3.4
distlib 0.3.6
dlinfo 1.2.1
dnspython 2.4.2
email-validator 2.1.0.post1
evaluate 0.4.0
exceptiongroup 1.1.1
execnet 1.9.0
faiss-cpu 1.7.3
fastapi 0.95.1
fastjsonschema 2.16.3
filelock 3.9.1
fire 0.5.0
Flask 2.2.3
flatbuffers 23.3.3
fonttools 4.39.3
frozenlist 1.3.3
fsspec 2023.10.0
fugashi 1.2.1
fvcore 0.1.5.post20221221
gitdb 4.0.10
GitPython 3.1.18
google-auth 2.17.3
google-auth-oauthlib 1.0.0
gql 3.4.0
graphql-core 3.2.3
greenlet 2.0.2
grpcio 1.51.3
h11 0.14.0
hf-doc-builder 0.4.0
httpcore 0.18.0
httptools 0.6.1
httpx 0.25.0
huggingface-hub 0.18.0
humanfriendly 10.0
hydra-core 1.3.2
hypothesis 6.68.3
idna 3.4
importlib-metadata 6.0.0
importlib-resources 5.12.0
iniconfig 2.0.0
iopath 0.1.9
ipadic 1.0.0
isodate 0.6.1
isort 5.12.0
itsdangerous 2.0.1
Jinja2 3.1.2
jinja2-time 0.2.0
joblib 1.2.0
jsonschema 4.17.3
jupyter_core 5.2.0
kenlm 0.1
kiwisolver 1.4.4
language-tags 1.2.0
lazy_loader 0.1
librosa 0.10.0
lit 16.0.1
llvmlite 0.39.1
lxml 4.9.2
Mako 1.2.4
Markdown 3.4.1
MarkupSafe 2.1.2
matplotlib 3.7.1
mpmath 1.3.0
msgpack 1.0.5
multidict 6.0.4
multiprocess 0.70.12.2
mypy-extensions 1.0.0
nbformat 5.7.3
networkx 3.1
nltk 3.8.1
numba 0.56.4
numpy 1.23.5
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
oauthlib 3.2.2
omegaconf 2.3.0
onnx 1.13.1
onnxruntime 1.14.1
onnxruntime-gpu 1.16.1
onnxruntime-tools 1.7.0
optimum 1.13.2
optuna 3.1.0
orjson 3.9.9
packaging 23.0
pandas 1.5.3
parameterized 0.8.1
pathspec 0.11.1
phonemizer 3.2.1
Pillow 9.4.0
pip 23.3.1
pkgutil_resolve_name 1.3.10
plac 1.3.5
platformdirs 3.1.1
plotly 5.13.1
pluggy 1.0.0
pooch 1.7.0
portalocker 2.0.0
poyo 0.5.0
protobuf 3.20.2
psutil 5.9.4
py-cpuinfo 9.0.0
py3nvml 0.2.7
pyarrow 11.0.0
pyasn1 0.4.8
pyasn1-modules 0.2.8
pycocotools 2.0.6
pycparser 2.21
pyctcdecode 0.5.0
pydantic 1.10.7
pydantic-yaml 0.11.2
pygtrie 2.5.0
pylatexenc 2.10
pynvml 11.5.0
pyparsing 3.0.9
pypng 0.20220715.0
pyrsistent 0.19.3
pytesseract 0.3.10
pytest 7.2.2
pytest-timeout 2.1.0
pytest-xdist 3.2.1
python-dateutil 2.8.2
python-dotenv 1.0.0
python-multipart 0.0.6
python-slugify 8.0.1
pytz 2022.7.1
pytz-deprecation-shim 0.1.0.post0
PyYAML 5.4.1
ray 2.3.0
rdflib 6.2.0
regex 2022.10.31
requests 2.28.2
requests-oauthlib 1.3.1
requests-toolbelt 0.10.1
responses 0.18.0
rfc3986 1.5.0
rhoknp 1.2.1
rjieba 0.1.11
rouge-score 0.1.2
rsa 4.9
ruff 0.0.256
sacrebleu 1.5.1
sacremoses 0.0.53
safetensors 0.3.0
scikit-learn 1.2.2
scipy 1.10.1
segments 2.2.1
sentencepiece 0.1.97
setuptools 68.2.2
sigopt 8.7.0
six 1.16.0
smmap 5.0.0
sniffio 1.3.0
sortedcontainers 2.4.0
soundfile 0.12.1
soupsieve 2.4
soxr 0.3.4
SQLAlchemy 2.0.6
starlette 0.26.1
SudachiDict-core 20230110
SudachiPy 0.6.7
sympy 1.11.1
tabulate 0.9.0
tenacity 8.2.2
tensorboard 2.12.2
tensorboard-data-server 0.7.0
tensorboard-plugin-wit 1.8.1
tensorboardX 2.6
termcolor 2.2.0
text-unidecode 1.3
threadpoolctl 3.1.0
timeout-decorator 0.5.0
timm 0.6.12
tokenizers 0.13.2
tomli 2.0.1
torch 1.13.0
torchaudio 2.0.1+cu117
torchvision 0.15.1+cu117
tqdm 4.65.0
traitlets 5.9.0
transformers 4.29.2 /usr/local/lib/python3.8/dist-packages
triton 2.0.0
types-Deprecated 1.2.9.3
typing_extensions 4.5.0
tzdata 2022.7
tzlocal 4.2
ujson 5.8.0
unidic 1.1.0
unidic-lite 1.0.8
uritemplate 4.1.1
urllib3 1.26.15
uvicorn 0.22.0
uvloop 0.19.0
virtualenv 20.21.0
wasabi 0.10.1
watchfiles 0.21.0
websockets 12.0
Werkzeug 2.2.3
wheel 0.34.2
wrapt 1.15.0
xmltodict 0.13.0
xxhash 3.2.0
yacs 0.1.8
yarl 1.8.2
zipp 3.15.0
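Of the packages listed above, the ones that show up in the tracebacks further down are onnx, onnxruntime (installed both as onnxruntime 1.14.1 and as onnxruntime-gpu 1.16.1), and optimum. As a minimal sketch, the following prints just those versions from inside the container. The helper name `installed_versions` is hypothetical and not part of any of these libraries; it relies only on the standard-library `importlib.metadata`.

```python
from importlib import metadata  # in the standard library since Python 3.8


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names taken from the package list above.
    wanted = ["onnx", "onnxruntime", "onnxruntime-gpu", "optimum"]
    for name, version in installed_versions(wanted).items():
        print(f"{name}: {version or 'not installed'}")
```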
Who can help?
@JingyaHuang, @echarlaix
Information
- [X] The official example scripts
- [X] My own modified scripts
Tasks
- [X] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction (minimal, reproducible, runnable)
Hello,
I wrote this Dockerfile:
FROM huggingface/transformers-pytorch-gpu:4.29.2
ARG DEBIAN_FRONTEND=noninteractive
WORKDIR /src
ENV PYTHONPATH="${PYTHONPATH}:${WORKDIR}"
ENV TRANSFORMERS_CACHE="/src/.cache/"
COPY requirements.txt $WORKDIR
RUN apt-get update && apt upgrade -y && \
    apt-get install -y libsm6 libxrender1 libfontconfig1 libxext6 libgl1-mesa-glx ffmpeg && \
    pip install -U pip setuptools && \
    pip install -U --no-cache-dir -r requirements.txt
COPY . $WORKDIR
I also have this requirements.txt file:
optimum[onnxruntime-gpu]==1.8.6
transformers==4.29.2
torch==1.13.0
fastapi[all]==0.95.1
uvicorn==0.22.0
numpy==1.23.5
pydantic==1.10.7
pydantic-yaml==0.11.2
Then I build the image with this command:
docker build -t my_docker_image .
Then I run it with this command:
docker run -it --rm --gpus 0 my_docker_image
Then I create this Python script, test_import.py, inside my Docker container:
from optimum.onnxruntime import ORTOptimizer
from optimum.onnxruntime import ORTModelForSequenceClassification
from optimum.onnxruntime import ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig, OptimizationConfig
Then I run the script with this command:
python3 test_import.py
And I get this error:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/transformers/utils/import_utils.py", line 1172, in _get_module
return importlib.import_module("." + module_name, self.__name__)
File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 848, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/usr/local/lib/python3.8/dist-packages/optimum/onnxruntime/optimization.py", line 31, in <module>
from .configuration import OptimizationConfig, ORTConfig
File "/usr/local/lib/python3.8/dist-packages/optimum/onnxruntime/configuration.py", line 27, in <module>
from onnxruntime.quantization import CalibraterBase, CalibrationMethod, QuantFormat, QuantizationMode, QuantType
File "/usr/local/lib/python3.8/dist-packages/onnxruntime/quantization/__init__.py", line 1, in <module>
from .calibrate import ( # noqa: F401
File "/usr/local/lib/python3.8/dist-packages/onnxruntime/quantization/calibrate.py", line 21, in <module>
from .quant_utils import apply_plot, load_model_with_shape_infer, smooth_distribution
File "/usr/local/lib/python3.8/dist-packages/onnxruntime/quantization/quant_utils.py", line 115, in <module>
onnx_proto.TensorProto.FLOAT8E4M3FN: float8e4m3fn,
AttributeError: FLOAT8E4M3FN
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "test_import.py", line 1, in <module>
from optimum.onnxruntime import ORTOptimizer
File "<frozen importlib._bootstrap>", line 1039, in _handle_fromlist
File "/usr/local/lib/python3.8/dist-packages/transformers/utils/import_utils.py", line 1162, in __getattr__
module = self._get_module(self._class_to_module[name])
File "/usr/local/lib/python3.8/dist-packages/transformers/utils/import_utils.py", line 1174, in _get_module
raise RuntimeError(
RuntimeError: Failed to import optimum.onnxruntime.optimization because of the following error (look up to see its traceback):
FLOAT8E4M3FN
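The `AttributeError` is raised when `quant_utils.py` looks up `onnx_proto.TensorProto.FLOAT8E4M3FN`, which suggests that the `TensorProto` class of the installed onnx package has no member with that name. A minimal sketch to check this directly, assuming that `onnx_proto` in the traceback refers to the installed `onnx` package; the helper name `tensorproto_has` is hypothetical:

```python
import importlib


def tensorproto_has(attr_name):
    """Return True/False if onnx.TensorProto has attr_name, or None if onnx is missing."""
    try:
        onnx = importlib.import_module("onnx")
    except ImportError:
        return None  # onnx is not installed in this environment
    return hasattr(onnx.TensorProto, attr_name)


if __name__ == "__main__":
    # The attribute name is copied from the traceback above.
    print("FLOAT8E4M3FN available:", tensorproto_has("FLOAT8E4M3FN"))
```

If this prints `False`, the installed onnx does not define the enum member that the installed onnxruntime quantization module expects.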
Then I comment out the line that leads to the error, so test_import.py now looks like this:
# from optimum.onnxruntime import ORTOptimizer
from optimum.onnxruntime import ORTModelForSequenceClassification
from optimum.onnxruntime import ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig, OptimizationConfig
I run it again and get a new error:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/transformers/utils/import_utils.py", line 1172, in _get_module
return importlib.import_module("." + module_name, self.__name__)
File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 848, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/usr/local/lib/python3.8/dist-packages/optimum/onnxruntime/quantization.py", line 28, in <module>
from onnxruntime.quantization import CalibrationDataReader, QuantFormat, QuantizationMode, QuantType
File "/usr/local/lib/python3.8/dist-packages/onnxruntime/quantization/__init__.py", line 1, in <module>
from .calibrate import ( # noqa: F401
File "/usr/local/lib/python3.8/dist-packages/onnxruntime/quantization/calibrate.py", line 21, in <module>
from .quant_utils import apply_plot, load_model_with_shape_infer, smooth_distribution
File "/usr/local/lib/python3.8/dist-packages/onnxruntime/quantization/quant_utils.py", line 115, in <module>
onnx_proto.TensorProto.FLOAT8E4M3FN: float8e4m3fn,
AttributeError: FLOAT8E4M3FN
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "test_import.py", line 3, in <module>
from optimum.onnxruntime import ORTQuantizer
File "<frozen importlib._bootstrap>", line 1039, in _handle_fromlist
File "/usr/local/lib/python3.8/dist-packages/transformers/utils/import_utils.py", line 1162, in __getattr__
module = self._get_module(self._class_to_module[name])
File "/usr/local/lib/python3.8/dist-packages/transformers/utils/import_utils.py", line 1174, in _get_module
raise RuntimeError(
RuntimeError: Failed to import optimum.onnxruntime.quantization because of the following error (look up to see its traceback):
FLOAT8E4M3FN
Then I comment out that line as well, so test_import.py now looks like this:
# from optimum.onnxruntime import ORTOptimizer
from optimum.onnxruntime import ORTModelForSequenceClassification
# from optimum.onnxruntime import ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig, OptimizationConfig
I run it again and get another error:
Traceback (most recent call last):
File "test_import.py", line 4, in <module>
from optimum.onnxruntime.configuration import AutoQuantizationConfig, OptimizationConfig
File "/usr/local/lib/python3.8/dist-packages/optimum/onnxruntime/configuration.py", line 27, in <module>
from onnxruntime.quantization import CalibraterBase, CalibrationMethod, QuantFormat, QuantizationMode, QuantType
File "/usr/local/lib/python3.8/dist-packages/onnxruntime/quantization/__init__.py", line 1, in <module>
from .calibrate import ( # noqa: F401
File "/usr/local/lib/python3.8/dist-packages/onnxruntime/quantization/calibrate.py", line 21, in <module>
from .quant_utils import apply_plot, load_model_with_shape_infer, smooth_distribution
File "/usr/local/lib/python3.8/dist-packages/onnxruntime/quantization/quant_utils.py", line 115, in <module>
onnx_proto.TensorProto.FLOAT8E4M3FN: float8e4m3fn,
AttributeError: FLOAT8E4M3FN
I comment out this line too, so test_import.py now looks like this:
# from optimum.onnxruntime import ORTOptimizer
from optimum.onnxruntime import ORTModelForSequenceClassification
# from optimum.onnxruntime import ORTQuantizer
# from optimum.onnxruntime.configuration import AutoQuantizationConfig, OptimizationConfig
I run it again and there are no errors.
Please help me fix these errors.
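Commenting out one import at a time can be condensed into a single run. As a minimal sketch, the script below tries each import from test_import.py separately and reports which ones fail, rather than stopping at the first error. The helper name `probe_imports` is hypothetical; the module and class names are the ones used above.

```python
import importlib


def probe_imports(targets):
    """Try each (module, attribute) pair; return {label: None or error text}."""
    results = {}
    for module_name, attr in targets:
        label = f"{module_name}.{attr}"
        try:
            module = importlib.import_module(module_name)
            getattr(module, attr)  # also triggers lazy submodule imports, if any
            results[label] = None
        except Exception as exc:  # record the failure and keep going
            results[label] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    targets = [
        ("optimum.onnxruntime", "ORTOptimizer"),
        ("optimum.onnxruntime", "ORTModelForSequenceClassification"),
        ("optimum.onnxruntime", "ORTQuantizer"),
        ("optimum.onnxruntime.configuration", "AutoQuantizationConfig"),
        ("optimum.onnxruntime.configuration", "OptimizationConfig"),
    ]
    for label, error in probe_imports(targets).items():
        print(f"{label}: {error or 'OK'}")
```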
Expected behavior
I expected these classes to import without errors, so that I could optimize the model graphs and quantize the models to run on my NVIDIA A10 GPU.