(pad_on_left and) ids_tensor error(s) in Dynamic Quantization on BERT tutorial
Tutorial link: https://github.com/pytorch/tutorials/blob/master/intermediate_source/dynamic_quantization_bert_tutorial.rst Version 1.8.0
Whether I use the Colab version or follow the tutorial text locally, the tutorial fails in section 3.2:
# Evaluate the original FP32 BERT model
time_model_evaluation(model, configs, tokenizer)
Output from colab version:
/usr/local/lib/python3.7/dist-packages/transformers/data/processors/glue.py:175: FutureWarning: This processor will be removed from the library soon, preprocessing should be handled with the 🤗 Datasets library. You can have a look at this example script for pointers: https://github.com/huggingface/transformers/blob/master/examples/text-classification/run_glue.py
warnings.warn(DEPRECATION_WARNING.format("processor"), FutureWarning)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-14-7f3f2ffdfbf3> in <module>()
8
9 # Evaluate the original FP32 BERT model
---> 10 time_model_evaluation(model, configs, tokenizer)
2 frames
<ipython-input-11-9c9008fc3551> in load_and_cache_examples(args, task, tokenizer, evaluate)
113 pad_on_left=bool(args.model_type in ['xlnet']), # pad on the left for xlnet
114 pad_token=tokenizer.convert_tokens_to_ids([tokenizer.pad_token])[0],
--> 115 pad_token_segment_id=4 if args.model_type in ['xlnet'] else 0,
116 )
117 if args.local_rank in [-1, 0]:
TypeError: glue_convert_examples_to_features() got an unexpected keyword argument 'pad_on_left'
I can work around this by commenting out the pad arguments in section 2.3:
features = convert_examples_to_features(examples,
tokenizer,
label_list=label_list,
max_length=args.max_seq_length,
output_mode=output_mode,
# pad_on_left=bool(args.model_type in ['xlnet']), # pad on the left for xlnet
# pad_token=tokenizer.convert_tokens_to_ids([tokenizer.pad_token])[0],
# pad_token_segment_id=4 if args.model_type in ['xlnet'] else 0,
)
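A less manual variant of the workaround above is to drop only those keyword arguments that the installed function does not accept. The following is a minimal sketch, not the tutorial's code: `call_with_supported_kwargs` and `new_style_convert` are hypothetical names, and `new_style_convert` is only a stand-in for a newer `glue_convert_examples_to_features` whose signature no longer includes the padding arguments.

```python
import inspect


def call_with_supported_kwargs(func, *args, **kwargs):
    """Call func, dropping keyword arguments its signature does not accept.

    Hypothetical helper for illustration; it is not part of transformers.
    """
    params = inspect.signature(func).parameters
    # If func itself takes **kwargs, nothing needs to be filtered.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return func(*args, **kwargs)
    supported = {k: v for k, v in kwargs.items() if k in params}
    dropped = sorted(set(kwargs) - set(supported))
    if dropped:
        print(f"Dropping unsupported keyword arguments: {dropped}")
    return func(*args, **supported)


# Stand-in for a newer conversion function without the padding arguments
# (an assumption made for this example only).
def new_style_convert(examples, tokenizer, max_length=None,
                      label_list=None, output_mode=None):
    return {"n_examples": len(examples), "max_length": max_length}


result = call_with_supported_kwargs(
    new_style_convert,
    ["example a", "example b"],
    None,  # tokenizer placeholder
    max_length=128,
    pad_on_left=False,
    pad_token=0,
    pad_token_segment_id=0,
)
print(result)
```

Silently dropping arguments is only safe if the newer function's default padding behaviour matches what the tutorial expects, so the printed list of dropped names is worth checking.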
There is another problem when running through the tutorial locally, when I got to section 3.3:
input_ids = ids_tensor([8, 128], 2)
I got this error:
Traceback (most recent call last):
File "bert-test.py", line 307, in <module>
input_ids = ids_tensor([8, 128], 2)
NameError: name 'ids_tensor' is not defined
The Colab version diverges at this point and doesn't use the ids_tensor function.
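Until the tutorial text defines the helper, a local stand-in can unblock section 3.3. The sketch below is an assumption about what the tutorial intends by `ids_tensor(shape, vocab_size)`: a tensor of the given shape filled with random integer ids in the range `[0, vocab_size)`. The `seed` parameter and the `random_id_values` helper are additions for illustration only.

```python
import random


def random_id_values(shape, vocab_size, seed=None):
    """Return a flat list of random ids in [0, vocab_size), one per element."""
    rng = random.Random(seed)
    total = 1
    for dim in shape:
        total *= dim
    return [rng.randrange(vocab_size) for _ in range(total)]


def ids_tensor(shape, vocab_size, seed=None):
    """Assumed stand-in for the undefined tutorial helper (not confirmed)."""
    import torch  # imported lazily so random_id_values works without PyTorch

    values = random_id_values(shape, vocab_size, seed)
    return torch.tensor(values, dtype=torch.long).view(*shape)


# Usage matching the failing line in section 3.3 (requires PyTorch):
# input_ids = ids_tensor([8, 128], 2)
```

With this definition in scope, the failing `input_ids = ids_tensor([8, 128], 2)` line would produce an 8 x 128 tensor of zeros and ones.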
cc @jerryzh168 @jianyuh @z-a-f @vkuzo
UPDATE: The pad_on_left error occurred when using a recent build of PyTorch (1.9.0a0+git0569f63 locally, or whatever version Colab is using). If I use torch==1.8.0 locally, it does not happen.
However, the ids_tensor issue is still present in the local version.
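Since the behaviour depends on which versions are installed, it may help to record them when reproducing this. The snippet below is a small sketch using only the standard library (Python 3.8+); `installed_version` is a hypothetical helper name, and checking `transformers` alongside `torch` is a suggestion rather than something confirmed in this report.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for name in ("torch", "transformers"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```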
Hi @jjohnson-arm, sorry, I didn't see this earlier. I think the tutorial was compatible with HuggingFace as of late 2019 / early 2020, and we need to update it for the latest version. For the ids_tensor issue, did you see it in the Colab version?
https://colab.research.google.com/github/pytorch/tutorials/blob/gh-pages/_downloads/dynamic_quantization_bert_tutorial.ipynb
/assigntome
@svekars @jianyuh Can I get edit access to the Colab file, or should I make a new Colab file, copy the contents over, and then change the link in the documentation?