pytorch-crf

IndexError: index -100 is out of bounds for dimension 0 with size 9

Open venti07 opened this issue 3 years ago • 1 comments

Hello, I am trying to use a BERTCRF model. Unfortunately, the following error message appears: IndexError: index -100 is out of bounds for dimension 0 with size 9

I am using a notebook from the Transformers Notebooks for token classification as a base and would like to use a BERTCRF model instead of AutoModelForTokenClassification. https://huggingface.co/docs/transformers/notebooks

I have set up a notebook and inserted the appropriate BERTCRF models: https://github.com/venti07/share/blob/main/classification_bertcrf.ipynb

Maybe someone can quickly find the error. I would appreciate it very much. Thanks in advance!

venti07 avatar Aug 22 '22 08:08 venti07

During training, it seems your labels still contain the padding index (-100), which the CRF layer does not accept as a tag. You need to remove it.

TidorP avatar Sep 08 '22 22:09 TidorP

@TidorP is right. Please set those indices to a value between 0 and 8 before passing them through the CRF layer. You can restore them afterwards.
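The replace-and-restore round trip described above could look roughly like the sketch below. The helper names `replace_ignored` and `restore_ignored` are hypothetical (not part of pytorch-crf), and the fill value 0 is an arbitrary choice among the valid tag indices.

```python
import torch

IGNORE_INDEX = -100  # label value the notebook assigns to ignored positions


def replace_ignored(tags: torch.Tensor, fill_value: int = 0):
    """Return a copy of `tags` with ignored positions set to `fill_value`,
    plus a boolean tensor recording where the ignored positions were."""
    ignored = tags == IGNORE_INDEX
    return tags.masked_fill(ignored, fill_value), ignored


def restore_ignored(tags: torch.Tensor, ignored: torch.Tensor) -> torch.Tensor:
    """Put IGNORE_INDEX back at the positions recorded in `ignored`."""
    return tags.masked_fill(ignored, IGNORE_INDEX)


# Toy batch of 2 sequences, 4 positions each (batch-first layout assumed).
tags = torch.tensor([[-100, 3, 5, -100],
                     [-100, 1, -100, -100]])

clean_tags, ignored = replace_ignored(tags)      # safe to index with
restored = restore_ignored(clean_tags, ignored)  # identical to the input
```

Keeping the `ignored` tensor around is what makes the restore step possible, since the filled value is indistinguishable from a real tag 0.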

kmkurn avatar Sep 25 '22 00:09 kmkurn

Instead of removing them, I tried passing a mask to the CRF. The problem is that the CRF requires the first column of the mask to be all 1s, but the first position is the [CLS] token, which has a label of -100 after padding.

How can I overcome this?

siddharthtumre avatar Oct 31 '22 14:10 siddharthtumre

@siddharthtumre Just remove the [CLS] token before feeding into the CRF layer. So something like

scores = scores[:, 1:]
tags = tags[:, 1:]

should work (assuming the first dim is the batch size).
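As a minimal sketch of that slicing, the toy example below uses made-up shapes (2 sequences, 5 positions, 9 tags) and random scores as a stand-in for the model output. It assumes a batch-first layout and that -100 appears only at [CLS] and at trailing positions.

```python
import torch

batch_size, seq_len, num_tags = 2, 5, 9  # illustrative sizes only

# Stand-in for the per-token scores produced by the model.
scores = torch.randn(batch_size, seq_len, num_tags)
tags = torch.tensor([[-100, 2, 4, 1, -100],
                     [-100, 7, 0, -100, -100]])

# Drop position 0 ([CLS]) from both tensors so they stay aligned.
scores_no_cls = scores[:, 1:]
tags_no_cls = tags[:, 1:]

# Any mask must be built from (or sliced like) the shortened tags.
mask_no_cls = tags_no_cls != -100
first_column_on = bool(mask_no_cls[:, 0].all())
```

After the slice, the first column of the mask is on for every sequence, which is the condition the CRF layer checks.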

kmkurn avatar Nov 01 '22 08:11 kmkurn

I am facing the same error; my labels tensor has shape [512, 4]. How can I remove the -100 from every batch sample?

atul47B avatar Nov 29 '22 05:11 atul47B

@atul47B You can use something like

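is_pad = tags == -100  # True at positions the CRF should ignore
tags.masked_fill_(is_pad, 0)  # in place; any valid tag index works as the fill value
loss = -crf(emissions, tags, mask=~is_pad)  # crf returns the log-likelihood, so negate it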

The crf forward computation will ignore positions where mask is False regardless of the tag/label value.
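One possible variant of the snippet above builds the tags and mask without modifying the original labels, which may matter if the labels are reused later (for example, for metrics). The helper `make_crf_targets` is hypothetical, assumes a batch-first layout, and assumes [CLS] has already been dropped; it also checks up front that the first position of every sequence is unmasked.

```python
import torch

IGNORE_INDEX = -100


def make_crf_targets(labels: torch.Tensor):
    """Return (tags, mask) for a batch-first label tensor, leaving `labels` untouched."""
    mask = labels != IGNORE_INDEX
    if not bool(mask[:, 0].all()):
        raise ValueError(
            "the first position of every sequence must be unmasked; "
            "drop [CLS] before building the mask"
        )
    tags = labels.masked_fill(~mask, 0)  # out-of-place fill with a valid tag index
    return tags, mask


labels = torch.tensor([[2, 4, 1, -100],
                       [7, 0, -100, -100]])  # [CLS] already removed
tags, mask = make_crf_targets(labels)

# Intended use (not executed here), assuming crf = torchcrf.CRF(num_tags, batch_first=True):
# loss = -crf(emissions, tags, mask=mask)
```

Note that `batch_first=True` has to be set explicitly in this layout; pytorch-crf defaults to sequence-first tensors.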

kmkurn avatar Dec 02 '22 23:12 kmkurn

Closing because the issue is resolved.

kmkurn avatar Dec 09 '22 23:12 kmkurn