optimum Error onnx_proto.TensorProto.FLOAT8E4M3FN: float8e4m3fn

Open medphisiker opened this issue 1 year ago • 4 comments

System Info

platform: Linux Ubuntu Server 20.04 x64

Python 3.8.10 (default, May 26 2023, 14:05:08) 
[GCC 9.4.0] on linux

python packages:

Package                   Version            Editable project location
------------------------- ------------------ --------------------------------------
absl-py                   1.4.0
accelerate                0.24.0
aiohttp                   3.8.4
aiosignal                 1.3.1
alembic                   1.10.2
antlr4-python3-runtime    4.9.3
anyio                     4.0.0
APScheduler               3.10.1
arrow                     1.2.3
async-timeout             4.0.2
attrs                     22.2.0
audioread                 3.0.0
av                        9.2.0
Babel                     2.12.1
backoff                   1.11.1
backports.zoneinfo        0.2.1
beautifulsoup4            4.11.2
binaryornot               0.4.4
black                     23.1.0
cachetools                5.3.0
certifi                   2022.12.7
cffi                      1.15.1
chardet                   5.1.0
charset-normalizer        3.1.0
click                     8.1.3
clldutils                 3.19.0
cloudpickle               2.2.1
cmaes                     0.9.1
cmake                     3.26.3
codecarbon                1.2.0
colorama                  0.4.6
coloredlogs               15.0.1
colorlog                  6.7.0
contourpy                 1.0.7
cookiecutter              1.7.3
csvw                      3.1.3
cycler                    0.11.0
dash                      2.8.1
dash-bootstrap-components 1.4.0
dash-core-components      2.0.0
dash-html-components      2.0.0
dash-table                5.0.0
datasets                  2.10.1
decorator                 5.1.1
decord                    0.6.0
Deprecated                1.2.14
detectron2                0.6
dill                      0.3.4
distlib                   0.3.6
dlinfo                    1.2.1
dnspython                 2.4.2
email-validator           2.1.0.post1
evaluate                  0.4.0
exceptiongroup            1.1.1
execnet                   1.9.0
faiss-cpu                 1.7.3
fastapi                   0.95.1
fastjsonschema            2.16.3
filelock                  3.9.1
fire                      0.5.0
Flask                     2.2.3
flatbuffers               23.3.3
fonttools                 4.39.3
frozenlist                1.3.3
fsspec                    2023.10.0
fugashi                   1.2.1
fvcore                    0.1.5.post20221221
gitdb                     4.0.10
GitPython                 3.1.18
google-auth               2.17.3
google-auth-oauthlib      1.0.0
gql                       3.4.0
graphql-core              3.2.3
greenlet                  2.0.2
grpcio                    1.51.3
h11                       0.14.0
hf-doc-builder            0.4.0
httpcore                  0.18.0
httptools                 0.6.1
httpx                     0.25.0
huggingface-hub           0.18.0
humanfriendly             10.0
hydra-core                1.3.2
hypothesis                6.68.3
idna                      3.4
importlib-metadata        6.0.0
importlib-resources       5.12.0
iniconfig                 2.0.0
iopath                    0.1.9
ipadic                    1.0.0
isodate                   0.6.1
isort                     5.12.0
itsdangerous              2.0.1
Jinja2                    3.1.2
jinja2-time               0.2.0
joblib                    1.2.0
jsonschema                4.17.3
jupyter_core              5.2.0
kenlm                     0.1
kiwisolver                1.4.4
language-tags             1.2.0
lazy_loader               0.1
librosa                   0.10.0
lit                       16.0.1
llvmlite                  0.39.1
lxml                      4.9.2
Mako                      1.2.4
Markdown                  3.4.1
MarkupSafe                2.1.2
matplotlib                3.7.1
mpmath                    1.3.0
msgpack                   1.0.5
multidict                 6.0.4
multiprocess              0.70.12.2
mypy-extensions           1.0.0
nbformat                  5.7.3
networkx                  3.1
nltk                      3.8.1
numba                     0.56.4
numpy                     1.23.5
nvidia-cublas-cu11        11.10.3.66
nvidia-cuda-nvrtc-cu11    11.7.99
nvidia-cuda-runtime-cu11  11.7.99
nvidia-cudnn-cu11         8.5.0.96
oauthlib                  3.2.2
omegaconf                 2.3.0
onnx                      1.13.1
onnxruntime               1.14.1
onnxruntime-gpu           1.16.1
onnxruntime-tools         1.7.0
optimum                   1.13.2
optuna                    3.1.0
orjson                    3.9.9
packaging                 23.0
pandas                    1.5.3
parameterized             0.8.1
pathspec                  0.11.1
phonemizer                3.2.1
Pillow                    9.4.0
pip                       23.3.1
pkgutil_resolve_name      1.3.10
plac                      1.3.5
platformdirs              3.1.1
plotly                    5.13.1
pluggy                    1.0.0
pooch                     1.7.0
portalocker               2.0.0
poyo                      0.5.0
protobuf                  3.20.2
psutil                    5.9.4
py-cpuinfo                9.0.0
py3nvml                   0.2.7
pyarrow                   11.0.0
pyasn1                    0.4.8
pyasn1-modules            0.2.8
pycocotools               2.0.6
pycparser                 2.21
pyctcdecode               0.5.0
pydantic                  1.10.7
pydantic-yaml             0.11.2
pygtrie                   2.5.0
pylatexenc                2.10
pynvml                    11.5.0
pyparsing                 3.0.9
pypng                     0.20220715.0
pyrsistent                0.19.3
pytesseract               0.3.10
pytest                    7.2.2
pytest-timeout            2.1.0
pytest-xdist              3.2.1
python-dateutil           2.8.2
python-dotenv             1.0.0
python-multipart          0.0.6
python-slugify            8.0.1
pytz                      2022.7.1
pytz-deprecation-shim     0.1.0.post0
PyYAML                    5.4.1
ray                       2.3.0
rdflib                    6.2.0
regex                     2022.10.31
requests                  2.28.2
requests-oauthlib         1.3.1
requests-toolbelt         0.10.1
responses                 0.18.0
rfc3986                   1.5.0
rhoknp                    1.2.1
rjieba                    0.1.11
rouge-score               0.1.2
rsa                       4.9
ruff                      0.0.256
sacrebleu                 1.5.1
sacremoses                0.0.53
safetensors               0.3.0
scikit-learn              1.2.2
scipy                     1.10.1
segments                  2.2.1
sentencepiece             0.1.97
setuptools                68.2.2
sigopt                    8.7.0
six                       1.16.0
smmap                     5.0.0
sniffio                   1.3.0
sortedcontainers          2.4.0
soundfile                 0.12.1
soupsieve                 2.4
soxr                      0.3.4
SQLAlchemy                2.0.6
starlette                 0.26.1
SudachiDict-core          20230110
SudachiPy                 0.6.7
sympy                     1.11.1
tabulate                  0.9.0
tenacity                  8.2.2
tensorboard               2.12.2
tensorboard-data-server   0.7.0
tensorboard-plugin-wit    1.8.1
tensorboardX              2.6
termcolor                 2.2.0
text-unidecode            1.3
threadpoolctl             3.1.0
timeout-decorator         0.5.0
timm                      0.6.12
tokenizers                0.13.2
tomli                     2.0.1
torch                     1.13.0
torchaudio                2.0.1+cu117
torchvision               0.15.1+cu117
tqdm                      4.65.0
traitlets                 5.9.0
transformers              4.29.2             /usr/local/lib/python3.8/dist-packages
triton                    2.0.0
types-Deprecated          1.2.9.3
typing_extensions         4.5.0
tzdata                    2022.7
tzlocal                   4.2
ujson                     5.8.0
unidic                    1.1.0
unidic-lite               1.0.8
uritemplate               4.1.1
urllib3                   1.26.15
uvicorn                   0.22.0
uvloop                    0.19.0
virtualenv                20.21.0
wasabi                    0.10.1
watchfiles                0.21.0
websockets                12.0
Werkzeug                  2.2.3
wheel                     0.34.2
wrapt                     1.15.0
xmltodict                 0.13.0
xxhash                    3.2.0
yacs                      0.1.8
yarl                      1.8.2
zipp                      3.15.0

Who can help?

@JingyaHuang, @echarlaix

Information

[X] The official example scripts
[X] My own modified scripts

Tasks

[X] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
[ ] My own task or dataset (give details below)

Reproduction (minimal, reproducible, runnable)

Hello,

I wrote this Dockerfile:

FROM huggingface/transformers-pytorch-gpu:4.29.2
ARG DEBIAN_FRONTEND=noninteractive

WORKDIR /src
ENV PYTHONPATH="${PYTHONPATH}:${WORKDIR}"
ENV TRANSFORMERS_CACHE="/src/.cache/"

COPY requirements.txt $WORKDIR

RUN apt-get update && apt upgrade -y && \
		apt-get install -y libsm6 libxrender1 libfontconfig1 libxext6 libgl1-mesa-glx ffmpeg && \
		pip install -U pip setuptools && \
		pip install -U --no-cache-dir -r requirements.txt

COPY . $WORKDIR

I also have this requirements.txt file:

optimum[onnxruntime-gpu]==1.8.6
transformers==4.29.2
torch==1.13.0
fastapi[all]==0.95.1
uvicorn==0.22.0
numpy==1.23.5
pydantic==1.10.7
pydantic-yaml==0.11.2

Then I build the image with this command: docker build -t my_docker_image .

Then I run it with this command: docker run -it --rm --gpus 0 my_docker_image

Then I write this Python script, test_import.py, inside my Docker container:

from optimum.onnxruntime import ORTOptimizer
from optimum.onnxruntime import ORTModelForSequenceClassification
from optimum.onnxruntime import ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig, OptimizationConfig

Then I run the script with this command: python3 test_import.py

And I get this error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/transformers/utils/import_utils.py", line 1172, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 848, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/usr/local/lib/python3.8/dist-packages/optimum/onnxruntime/optimization.py", line 31, in <module>
    from .configuration import OptimizationConfig, ORTConfig
  File "/usr/local/lib/python3.8/dist-packages/optimum/onnxruntime/configuration.py", line 27, in <module>
    from onnxruntime.quantization import CalibraterBase, CalibrationMethod, QuantFormat, QuantizationMode, QuantType
  File "/usr/local/lib/python3.8/dist-packages/onnxruntime/quantization/__init__.py", line 1, in <module>
    from .calibrate import (  # noqa: F401
  File "/usr/local/lib/python3.8/dist-packages/onnxruntime/quantization/calibrate.py", line 21, in <module>
    from .quant_utils import apply_plot, load_model_with_shape_infer, smooth_distribution
  File "/usr/local/lib/python3.8/dist-packages/onnxruntime/quantization/quant_utils.py", line 115, in <module>
    onnx_proto.TensorProto.FLOAT8E4M3FN: float8e4m3fn,
AttributeError: FLOAT8E4M3FN

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "test_import.py", line 1, in <module>
    from optimum.onnxruntime import ORTOptimizer
  File "<frozen importlib._bootstrap>", line 1039, in _handle_fromlist
  File "/usr/local/lib/python3.8/dist-packages/transformers/utils/import_utils.py", line 1162, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/usr/local/lib/python3.8/dist-packages/transformers/utils/import_utils.py", line 1174, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import optimum.onnxruntime.optimization because of the following error (look up to see its traceback):
FLOAT8E4M3FN

Then I comment out the line that leads to the error. Now test_import.py looks like this:

# from optimum.onnxruntime import ORTOptimizer
from optimum.onnxruntime import ORTModelForSequenceClassification
from optimum.onnxruntime import ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig, OptimizationConfig

I run it again and get a new error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/transformers/utils/import_utils.py", line 1172, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 848, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/usr/local/lib/python3.8/dist-packages/optimum/onnxruntime/quantization.py", line 28, in <module>
    from onnxruntime.quantization import CalibrationDataReader, QuantFormat, QuantizationMode, QuantType
  File "/usr/local/lib/python3.8/dist-packages/onnxruntime/quantization/__init__.py", line 1, in <module>
    from .calibrate import (  # noqa: F401
  File "/usr/local/lib/python3.8/dist-packages/onnxruntime/quantization/calibrate.py", line 21, in <module>
    from .quant_utils import apply_plot, load_model_with_shape_infer, smooth_distribution
  File "/usr/local/lib/python3.8/dist-packages/onnxruntime/quantization/quant_utils.py", line 115, in <module>
    onnx_proto.TensorProto.FLOAT8E4M3FN: float8e4m3fn,
AttributeError: FLOAT8E4M3FN

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "test_import.py", line 3, in <module>
    from optimum.onnxruntime import ORTQuantizer
  File "<frozen importlib._bootstrap>", line 1039, in _handle_fromlist
  File "/usr/local/lib/python3.8/dist-packages/transformers/utils/import_utils.py", line 1162, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/usr/local/lib/python3.8/dist-packages/transformers/utils/import_utils.py", line 1174, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import optimum.onnxruntime.quantization because of the following error (look up to see its traceback):
FLOAT8E4M3FN

Then I comment out that line as well. Now test_import.py looks like this:

# from optimum.onnxruntime import ORTOptimizer
from optimum.onnxruntime import ORTModelForSequenceClassification
# from optimum.onnxruntime import ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig, OptimizationConfig

I run it again and get another error:

Traceback (most recent call last):
  File "test_import.py", line 4, in <module>
    from optimum.onnxruntime.configuration import AutoQuantizationConfig, OptimizationConfig
  File "/usr/local/lib/python3.8/dist-packages/optimum/onnxruntime/configuration.py", line 27, in <module>
    from onnxruntime.quantization import CalibraterBase, CalibrationMethod, QuantFormat, QuantizationMode, QuantType
  File "/usr/local/lib/python3.8/dist-packages/onnxruntime/quantization/__init__.py", line 1, in <module>
    from .calibrate import (  # noqa: F401
  File "/usr/local/lib/python3.8/dist-packages/onnxruntime/quantization/calibrate.py", line 21, in <module>
    from .quant_utils import apply_plot, load_model_with_shape_infer, smooth_distribution
  File "/usr/local/lib/python3.8/dist-packages/onnxruntime/quantization/quant_utils.py", line 115, in <module>
    onnx_proto.TensorProto.FLOAT8E4M3FN: float8e4m3fn,
AttributeError: FLOAT8E4M3FN

I comment out this line too. Now test_import.py looks like this:

# from optimum.onnxruntime import ORTOptimizer
from optimum.onnxruntime import ORTModelForSequenceClassification
# from optimum.onnxruntime import ORTQuantizer
# from optimum.onnxruntime.configuration import AutoQuantizationConfig, OptimizationConfig

I run it again and there are no errors.
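
The comment-out-and-rerun procedure above can also be done in one pass. The following is a minimal sketch, not part of optimum: `try_imports` is a hypothetical helper that attempts each import separately and records the outcome, so all failing names show up in a single run. It only uses the standard library, and the four targets are the ones from test_import.py.

```python
import importlib


def try_imports(targets):
    """Attempt each (module, attribute) import and record 'ok' or the error.

    `targets` is a list of (module_name, attribute_name) pairs.
    """
    results = {}
    for module_name, attr in targets:
        key = f"{module_name}.{attr}"
        try:
            module = importlib.import_module(module_name)
            # Accessing the attribute is what triggers lazily loaded submodules.
            getattr(module, attr)
            results[key] = "ok"
        except Exception as exc:  # record any failure instead of stopping
            results[key] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    # The same names that test_import.py imports.
    targets = [
        ("optimum.onnxruntime", "ORTOptimizer"),
        ("optimum.onnxruntime", "ORTModelForSequenceClassification"),
        ("optimum.onnxruntime", "ORTQuantizer"),
        ("optimum.onnxruntime.configuration", "AutoQuantizationConfig"),
    ]
    for name, outcome in try_imports(targets).items():
        print(f"{name}: {outcome}")
```

Catching `Exception` is deliberate here, because the failures above surface as both `RuntimeError` and `AttributeError`.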

Please help me fix these errors.
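
Every traceback above ends in onnxruntime/quantization/quant_utils.py looking up `onnx_proto.TensorProto.FLOAT8E4M3FN`, and the package list shows onnx 1.13.1 installed next to onnxruntime 1.14.1 and onnxruntime-gpu 1.16.1 (and optimum 1.13.2, although requirements.txt pins 1.8.6). One possible cause, stated here as an assumption rather than a confirmed diagnosis, is that the installed onnx is older than the onnxruntime release that is actually imported and does not define that enum value yet. A minimal sketch to check this inside the container follows; `installed_version` and `report` are hypothetical helper names.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report():
    """Collect the versions relevant to the traceback plus the enum check."""
    names = ("onnx", "onnxruntime", "onnxruntime-gpu", "optimum")
    info = {name: installed_version(name) for name in names}
    try:
        from onnx import TensorProto

        # False means the installed onnx lacks the value quant_utils.py looks up.
        info["has_FLOAT8E4M3FN"] = hasattr(TensorProto, "FLOAT8E4M3FN")
    except ImportError:
        # onnx itself is not importable, so the check cannot be made.
        info["has_FLOAT8E4M3FN"] = None
    return info


if __name__ == "__main__":
    for key, value in report().items():
        print(f"{key}: {value}")
```

If `has_FLOAT8E4M3FN` prints `False`, the onnx and onnxruntime versions are likely mismatched, and aligning them (for example by upgrading onnx, or by keeping only one onnxruntime distribution installed) would be the first thing to try.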

Expected behavior

I expected these classes to import without errors, so that I could optimize model graphs and quantize the models to run on my NVIDIA A10 GPU.

Oct 24 '23 20:10 medphisiker
