can't find module "tokenization"

mexiQQ opened this issue 4 years ago • 18 comments

language/bert/accuracy-squad.py

import numpy as np 
import six
import tokenization
from transformers import BertTokenizer
from create_squad_data import read_squad_examples, convert_examples_to_features

I followed the steps in the README file and started a Docker container to run the BERT evaluation. However, I got the above error, and I can't find any third-party module (tokenization) that satisfies the API used in the script (accuracy-squad.py).

Can anyone tell me where I can find this module (tokenization)? Thank you very much.

mexiQQ avatar Sep 08 '21 00:09 mexiQQ

tokenization.py is here: https://github.com/NVIDIA/DeepLearningExamples/blob/b03375bd6c2c5233130e61a3be49e26d1a20ac7c/TensorFlow/LanguageModeling/BERT/tokenization.py

Could you add

sys.path.insert(0, "DeepLearningExamples/TensorFlow/LanguageModeling/BERT")

and see if it helps?
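For context, a minimal sketch of what that looks like near the top of accuracy-squad.py, assuming DeepLearningExamples is cloned into the current working directory (adjust the path otherwise):

import sys

# Make NVIDIA's BERT reference code importable; the path is an assumption.
sys.path.insert(0, "DeepLearningExamples/TensorFlow/LanguageModeling/BERT")

import tokenization  # now resolves to NVIDIA's tokenization.py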

nvpohanh avatar Sep 08 '21 02:09 nvpohanh

Hi,

This bug was introduced as part of the decoupling work I did for BERT. I have two fixes, one with and one without the use of env variables. I'm currently on leave until Friday 17th, but my team may upstream either fix in the meantime.

LukeIreland1 avatar Sep 08 '21 08:09 LukeIreland1

@LukeIreland1 Are there any updates on this issue? If there are none, it will be closed next week.

rnaidu02 avatar Nov 02 '21 16:11 rnaidu02

@nsircombe, do we still plan to upstream (one of) our fix(es) to BERT accuracy mode?

LukeIreland1 avatar Nov 02 '21 16:11 LukeIreland1

do we still plan to upstream (one of) our fix(es) to BERT accuracy mode?

@LukeIreland1 It would be great if you could provide a fix and file a PR so that we can merge it upstream. Thanks!

nvpohanh avatar Nov 03 '21 04:11 nvpohanh

Can you please share (or merge) the fix for this issue?

shairoz-deci avatar Feb 01 '22 12:02 shairoz-deci

Can you please share (or merge) the fix for this issue?

It's been merged into master (#1047).

LukeIreland1 avatar Feb 01 '22 12:02 LukeIreland1

As far as I understand, new submissions for v2.0 are to use the r2.0 branch. Will the fix be merged there as well?

shairoz-deci avatar Feb 01 '22 12:02 shairoz-deci

@shairoz-deci According to the branch history of the r2.0 branch, it looks like @LukeIreland1's fix is included. Could you try the latest r2.0 branch?

nvpohanh avatar Feb 01 '22 12:02 nvpohanh

@nvpohanh thank you for the quick response. I am using the latest r2.0 branch, running inside the Docker container, of course.

The exact execution and error:

shai.rozenberg@mlperf-inference-bert-shai:/workspace$ python3 run.py --backend=onnxruntime --accuracy
  File "run.py", line 120, in <module>
    main()
  File "run.py", line 75, in main
    from onnxruntime_SUT import get_onnxruntime_sut
  File "/workspace/onnxruntime_SUT.py", line 27, in <module>
    from squad_QSL import get_squad_QSL
  File "/workspace/squad_QSL.py", line 23, in <module>
    from create_squad_data import read_squad_examples, convert_examples_to_features
  File "/workspace/create_squad_data.py", line 24, in <module>
    import tokenization
ModuleNotFoundError: No module named 'tokenization'

shairoz-deci avatar Feb 01 '22 12:02 shairoz-deci

My fix was only for TensorFlow and PyTorch, but I will take a look at ONNX and see if I can make a suitable fix using the same approach.

LukeIreland1 avatar Feb 01 '22 13:02 LukeIreland1

@shairoz-deci Okay, after taking a look: there is no "tokenization.py" for ONNX, just TensorFlow and PyTorch, which I guess is ultimately why I only added those options. It should really throw an error if neither is installed, since accuracy mode requires tokenization, which in turn requires at least one of TensorFlow or PyTorch. That really shouldn't be the case, given that my original patch was supposed to decouple the modes from any particular framework...

A temporary workaround (or perhaps the actual intended solution) is to install either PyTorch or TensorFlow, but I should really make accuracy mode completely framework agnostic by using a universal "tokenization.py". I'll look into the feasibility of this. It may also be that you need to provide your own ONNX "tokenization.py", as I have no experience with that framework.
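
As an illustration of the missing check, a minimal fail-fast sketch (an assumption about the shape of such a guard, not the merged fix):

try:
    import tokenization
except ModuleNotFoundError as exc:
    # Fail with an actionable message instead of a bare import error
    # raised deep inside create_squad_data.py.
    raise ModuleNotFoundError(
        "accuracy mode needs tokenization.py; install TensorFlow or "
        "PyTorch, or place tokenization.py next to this script"
    ) from exc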

LukeIreland1 avatar Feb 01 '22 14:02 LukeIreland1

@LukeIreland1 We intend to submit all our models in ONNX, quantized to INT8; as such, they will lose some accuracy relative to the corresponding PyTorch models used to export them. Can you explain what it means to "create my own tokenization.py"? Going from text to prediction with ONNX models is fairly simple, as can be seen in the following example: https://github.com/microsoft/onnxruntime-inference-examples/blob/main/quantization/notebooks/bert/Bert-GLUE_OnnxRuntime_quantization.ipynb (section 2.3)

shairoz-deci avatar Feb 01 '22 14:02 shairoz-deci

@shairoz-deci, The tf/pytorch point was about the fact that the import causing your error needs an existing "tokenization.py" to be found on the path, and the only existing copies were in folders labelled "TensorFlow" and "PyTorch". The files in question don't play any part in inference, though, and fortunately both are already framework agnostic anyway.

I'm going to get a patch reviewed (that should work) then will share it with you afterwards.

LukeIreland1 avatar Feb 01 '22 15:02 LukeIreland1

@shairoz-deci, It'll be quicker and easier (DeepLearningExamples is a different project, and I'm working for Arm, which doesn't have contribution guidelines for that project) if you copy "tokenization.py" from the TensorFlow folder to "language/bert", remove the "import tensorflow as tf" line, and remove the lines in "accuracy-squad.py" that check whether TensorFlow or PyTorch is installed; a sketch of these steps follows below.

Eventually I may do this myself, either as an employee or as an individual, and open a PR, but as I said, doing it yourself for the time being will be the faster option.
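
A minimal Python sketch of those manual steps (the checkout paths are assumptions; adjust them to your layout):

import pathlib
import shutil

# Assumed locations of the two checkouts; change to match your setup.
src = pathlib.Path("DeepLearningExamples/TensorFlow/LanguageModeling/BERT/tokenization.py")
dst = pathlib.Path("inference/language/bert/tokenization.py")
shutil.copy(src, dst)

# Drop the TensorFlow import so the module loads without any framework.
lines = dst.read_text().splitlines()
dst.write_text("\n".join(l for l in lines if l.strip() != "import tensorflow as tf"))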

LukeIreland1 avatar Feb 01 '22 16:02 LukeIreland1

Thanks @LukeIreland1, I copied tokenization.py as you mentioned, and the package loading part seems to be working. However, the test now seems to be stuck on "Running LoadGen test..."

python3 run.py --backend=onnxruntime --accuracy
Loading ONNX model...
2022-02-02 10:45:34.456389992 [W:onnxruntime:, graph.cc:2413 CleanUnusedInitializers] Removing initializer 'bert.pooler.dense.weight'. It is not used by any node and should be removed from the model.
2022-02-02 10:45:34.456487738 [W:onnxruntime:, graph.cc:2413 CleanUnusedInitializers] Removing initializer 'bert.pooler.dense.bias'. It is not used by any node and should be removed from the model.
Constructing SUT...
Finished constructing SUT.
Constructing QSL...
Loading cached features from 'eval_features.pickle'...
Finished constructing QSL.
Running LoadGen test...

Tried running the PyTorch backend and got:

RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`
Segmentation fault (core dumped)

I assume this is because the GPU I'm working on isn't supported by the package versions in the Docker container. When I try running the PyTorch backend on CPU, I get a different crash:

 CUDA_VISIBLE_DEVICES='' python3 run.py --backend=pytorch
Running LoadGen test...
Traceback (most recent call last):
  File "run.py", line 120, in <module>
    main()
  File "run.py", line 101, in main
    lg.StartTestWithLogSettings(sut.sut, sut.qsl.qsl, settings, log_settings)
  File "/workspace/pytorch_SUT.py", line 70, in issue_queries
    start_scores = model_output.start_logits
AttributeError: 'tuple' object has no attribute 'start_logits'
Segmentation fault (core dumped)

Should the PyTorch backend support CPU execution?
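
For reference, a version-tolerant unpacking sketch for pytorch_SUT.py; the (start_logits, end_logits) tuple layout is an assumption based on transformers' question-answering models, which return a plain tuple when return_dict=False:

def unpack_qa_output(model_output):
    # Newer transformers return a QuestionAnsweringModelOutput object;
    # with return_dict=False (or in older versions) it is a plain tuple.
    if isinstance(model_output, tuple):
        start_scores, end_scores = model_output[0], model_output[1]
    else:
        start_scores = model_output.start_logits
        end_scores = model_output.end_logits
    return start_scores, end_scores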

shairoz-deci avatar Feb 02 '22 09:02 shairoz-deci

Thanks @LukeIreland1, I copied tokenization.py as you mentioned, and the package loading part seems to be working. However, the test now seems to be stuck on "Running LoadGen test..."

@shairoz-deci, This looks to me like the right way to run it, even if it gets stuck, and the hang is definitely outside the scope of this issue. Could you try with --count 1 or some other small number and see if it completes?

LukeIreland1 avatar Feb 02 '22 10:02 LukeIreland1

With a small --max_examples, the onnxruntime backend does work. Thanks @LukeIreland1; should anything else regarding this come up, I will open a new issue.

shairoz-deci avatar Feb 02 '22 10:02 shairoz-deci

outdated

mrasquinha-g avatar May 23 '23 11:05 mrasquinha-g