
Sequence length should be multiple of 512. It can't be used directly for encoding

Open SouthWindShiB opened this issue 3 years ago • 1 comment

File "D:\Anaconda\envs\torch_1.7\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "D:\Anaconda\envs\torch_1.7\lib\site-packages\transformers\models\bert\modeling_bert.py", line 1068, in forward return_dict=return_dict, File "D:\Anaconda\envs\torch_1.7\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "D:\Anaconda\envs\torch_1.7\lib\site-packages\transformers\models\bert\modeling_bert.py", line 591, in forward output_attentions, File "D:\Anaconda\envs\torch_1.7\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "D:\Anaconda\envs\torch_1.7\lib\site-packages\transformers\models\bert\modeling_bert.py", line 476, in forward past_key_value=self_attn_past_key_value, File "D:\Anaconda\envs\torch_1.7\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "D:\Anaconda\envs\torch_1.7\lib\site-packages\transformers\models\bert\modeling_bert.py", line 408, in forward output_attentions, File "D:\Anaconda\envs\torch_1.7\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl result = self.forward(*input, **kwargs) File "I:\PycharmProject\zh_efficient-autogressive-EL\model\Longformer_zh.py", line 21, in forward output_attentions=output_attentions) File "D:\Anaconda\envs\torch_1.7\lib\site-packages\transformers\models\longformer\modeling_longformer.py", line 591, in forward query_vectors, key_vectors, self.one_sided_attn_window_size File "D:\Anaconda\envs\torch_1.7\lib\site-packages\transformers\models\longformer\modeling_longformer.py", line 803, in _sliding_chunks_query_key_matmul ), f"Sequence length should be multiple of {window_overlap * 2}. Given {seq_len}" AssertionError: Sequence length should be multiple of 512. Given 158

Did I miss a step that pads the sequence to a suitable length?

SouthWindShiB avatar Nov 15 '21 04:11 SouthWindShiB

Because of how Longformer's attention windows are structured, the input sequence length must be a multiple of the attention window size. To use the model, pad your input sequence to 512 or 1024 tokens and pass the corresponding attention mask so the padded positions are masked out.
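A minimal padding sketch, not taken from this repository: the checkpoint path, the choice of `BertTokenizer`/`LongformerModel`, and the window size of 512 are assumptions you may need to adjust for your setup. It relies on the `pad_to_multiple_of` option of the Hugging Face tokenizer to satisfy the length check.

```python
# Sketch only: pad inputs to a multiple of the attention window size
# before calling the model. MODEL_PATH is a hypothetical placeholder.
import torch
from transformers import BertTokenizer, LongformerModel

MODEL_PATH = "path/to/longformer_zh"  # replace with your local path or hub id

tokenizer = BertTokenizer.from_pretrained(MODEL_PATH)
model = LongformerModel.from_pretrained(MODEL_PATH)

text = "..."  # your input text

# Option 1: pad to the next multiple of the window size (assumed 512 here).
inputs = tokenizer(
    text,
    return_tensors="pt",
    padding=True,
    pad_to_multiple_of=512,  # sequence length must be a multiple of 512
)

# Option 2: pad to a fixed length such as 512 or 1024.
# inputs = tokenizer(text, return_tensors="pt",
#                    padding="max_length", max_length=1024, truncation=True)

with torch.no_grad():
    # attention_mask produced by the tokenizer marks the padded positions
    outputs = model(**inputs)
```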

ValkyriaLenneth avatar Nov 23 '21 03:11 ValkyriaLenneth