can't find module "tokenization"
language/bert/accuracy-squad.py:
import numpy as np
import six
import tokenization
from transformers import BertTokenizer
from create_squad_data import read_squad_examples, convert_examples_to_features
I followed the steps in the README file and started a Docker container to run the BERT evaluation. However, I got the above error, and I can't even find a third-party module (tokenization) that satisfies the API used in the script (accuracy-squad.py).
Can anyone tell me where I can find this module (tokenization)? Thank you very much.
tokenization.py is here: https://github.com/NVIDIA/DeepLearningExamples/blob/b03375bd6c2c5233130e61a3be49e26d1a20ac7c/TensorFlow/LanguageModeling/BERT/tokenization.py
Could you add
sys.path.insert(0, "DeepLearningExamples/TensorFlow/LanguageModeling/BERT")
and see if it helps?
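For example, something like this near the top of accuracy-squad.py, before the import tokenization line (a minimal sketch assuming you have cloned NVIDIA's DeepLearningExamples repo into the working directory; adjust the path if it lives elsewhere):
import sys
# Make the BERT utilities from DeepLearningExamples importable so that
# "import tokenization" resolves to the file linked above.
sys.path.insert(0, "DeepLearningExamples/TensorFlow/LanguageModeling/BERT")
import tokenization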
Hi,
This bug was introduced as part of the decoupling work I did for BERT. I have two fixes, one with and one without the use of env variables. I'm currently on leave until Friday 17th, but my team may upstream either fix in the meantime.
@LukeIreland1 Are there any updates on this issue? If there are none, it will be closed next week.
@nsircombe, do we still plan to upstream (one of) our fix(es) to BERT accuracy mode?
@LukeIreland1 It would be great if you could provide a fix and file a PR so that we can merge it upstream. Thanks!
Can you please share (or merge) the fix for this issue?
It's been merged into master #1047
As far as I understand, new submissions for v2.0 are to use the r2.0 branch. Will the fix be merged there as well?
@shairoz-deci According to the branch history of the r2.0 branch, it looks like @LukeIreland1's fix is included. Could you try the latest r2.0 branch?
@nvpohanh thank you for the quick response. I am using the latest r2.0 branch, running inside the Docker container of course.
The exact execution and error:
shai.rozenberg@mlperf-inference-bert-shai:/workspace$ python3 run.py --backend=onnxruntime --accuracy
File "run.py", line 120, in <module>
main()
File "run.py", line 75, in main
from onnxruntime_SUT import get_onnxruntime_sut
File "/workspace/onnxruntime_SUT.py", line 27, in <module>
from squad_QSL import get_squad_QSL
File "/workspace/squad_QSL.py", line 23, in <module>
from create_squad_data import read_squad_examples, convert_examples_to_features
File "/workspace/create_squad_data.py", line 24, in <module>
import tokenization
ModuleNotFoundError: No module named 'tokenization'
My fix was only for TensorFlow and PyTorch, but I will take a look at ONNX and see if I can make a suitable fix using the same approach.
@shairoz-deci Okay, after taking a look, there is no "tokenization.py" for ONNX, just TensorFlow and PyTorch, which I guess is ultimately why I only added those options. It should really throw an error if neither is installed, since accuracy mode requires tokenization, which in turn requires at least one of TensorFlow or PyTorch. That really shouldn't be the case, given that my original patch was supposed to decouple the modes from any given framework...
A temporary workaround (or perhaps the actual intended solution) is to install either PyTorch or TensorFlow, but I should really make accuracy mode completely framework-agnostic by using a universal "tokenization.py". I'll look into the feasibility of this. It may also be that you need to provide your own ONNX "tokenization.py", as I have no experience with that framework.
@LukeIreland1 We intend to submit all our models in ONNX, quantized to INT8, so they will lose accuracy relative to the corresponding PyTorch models used to export them. Can you explain what it means to "create my own tokenization.py"? Going from text to prediction with ONNX models is fairly simple, as can be seen in the following example: https://github.com/microsoft/onnxruntime-inference-examples/blob/main/quantization/notebooks/bert/Bert-GLUE_OnnxRuntime_quantization.ipynb (section 2.3)
@shairoz-deci, The tf/pytorch point was about the fact that the import causing your error needs an existing "tokenization.py" to be found on the path, and the only existing ones were in folders labelled "TensorFlow" and "PyTorch". The files in question don't play any part in inference, and fortunately they're both already framework-agnostic anyway.
I'm going to get a patch reviewed (that should work) and will then share it with you.
@shairoz-deci, It'll be quicker and easier (as DeepLearningExamples is a different project, and I'm working for Arm, which doesn't have contribution guidelines for that project) if you copy the "tokenization.py" from TensorFlow to "language/bert", remove its "import tensorflow as tf" line, and remove the lines in "accuracy-squad.py" that check whether TensorFlow or PyTorch is installed.
Eventually, I may do this myself, either as an employee or an individual, and open a PR, but as I said, doing it yourself for the time being will be the faster option.
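For illustration, the top of "accuracy-squad.py" might end up looking roughly like this after that change (a sketch under the assumptions above, not the actual patch; the exact framework-check lines to remove vary between versions):
# accuracy-squad.py (illustrative excerpt after the workaround)
# "tokenization.py" has been copied into language/bert from
# DeepLearningExamples/TensorFlow/LanguageModeling/BERT, with its
# "import tensorflow as tf" line removed, so this import resolves locally
# without requiring TensorFlow or PyTorch.
import numpy as np
import six
import tokenization
from transformers import BertTokenizer
from create_squad_data import read_squad_examples, convert_examples_to_features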
Thanks @LukeIreland1, I copied tokenization.py as you mentioned and the package-loading part seems to be working. However, the test now seems to be stuck on "Running LoadGen test..."
python3 run.py --backend=onnxruntime --accuracy
Loading ONNX model...
2022-02-02 10:45:34.456389992 [W:onnxruntime:, graph.cc:2413 CleanUnusedInitializers] Removing initializer 'bert.pooler.dense.weight'. It is not used by any node and should be removed from the model.
2022-02-02 10:45:34.456487738 [W:onnxruntime:, graph.cc:2413 CleanUnusedInitializers] Removing initializer 'bert.pooler.dense.bias'. It is not used by any node and should be removed from the model.
Constructing SUT...
Finished constructing SUT.
Constructing QSL...
Loading cached features from 'eval_features.pickle'...
Finished constructing QSL.
Running LoadGen test...
Tried running the PyTorch backend and got:
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`
Segmentation fault (core dumped)
I assume this is because the GPU I'm working on isn't supported by the Docker container's package versions. When I try running the PyTorch backend on CPU, I get a different crash:
CUDA_VISIBLE_DEVICES='' python3 run.py --backend=pytorch
Running LoadGen test...
Traceback (most recent call last):
File "run.py", line 120, in <module>
main()
File "run.py", line 101, in main
lg.StartTestWithLogSettings(sut.sut, sut.qsl.qsl, settings, log_settings)
File "/workspace/pytorch_SUT.py", line 70, in issue_queries
start_scores = model_output.start_logits
AttributeError: 'tuple' object has no attribute 'start_logits'
Segmentation fault (core dumped)
Should the PyTorch backend support CPU execution?
@shairoz-deci, This looks to me like the right way to run it, even if it's not working or gets stuck, and it is definitely outside the scope of this issue. Could you try with --count 1, or some other small number, and see if it completes?
With a small --max_examples the onnxruntime backend does work. Thanks @LukeIreland1, should anything else regarding this come up, I will open a new issue.