
When a label mapping isn't provided - we get a crash

Open jondot opened this issue 2 years ago • 1 comment

As opposed to transformers, where labels are generated ad hoc:

[{'label': 'LABEL_0', 'score': 0.999602735042572}]

To resolve this, we might want to add a label mapping to SequenceClassificationConfig with some defaults, but that might be too radical a change.

Another possible fix is to do the same thing as transformers and go:

let label_string = self.label_mapping.get(&id).cloned().unwrap_or_else(|| format!("LABEL_{id}"));

instead of

let label_string = self.label_mapping.get(&id).unwrap().to_owned();
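
For illustration, here is a minimal self-contained sketch of that fallback. The helper name and the HashMap<i64, String> type are assumptions for the example, not the actual rust-bert internals; note that get returns an Option<&String>, so the value has to be cloned before the fallback applies:

```rust
use std::collections::HashMap;

// Illustrative helper (not the actual rust-bert API): falls back to a
// transformers-style placeholder label when the mapping has no entry.
fn label_for(label_mapping: &HashMap<i64, String>, id: i64) -> String {
    label_mapping
        .get(&id)
        .cloned()
        .unwrap_or_else(|| format!("LABEL_{id}"))
}

fn main() {
    let empty: HashMap<i64, String> = HashMap::new();
    assert_eq!(label_for(&empty, 0), "LABEL_0");
}
```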

And then num_labels, when no mapping is specified, is... the magic number 2: https://github.com/huggingface/transformers/blob/95b374952dc27d8511541d6f5a4e22c9ec11fb24/src/transformers/configuration_utils.py#L331

Well, not so much magic if you assume that a classifier with no other information provided is always binary, which is what the Python lib seems to do.
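
If rust-bert wanted to mirror that behaviour, one sketch of the default would be to synthesize a two-entry mapping when the configuration provides none. This is purely illustrative; the function and parameter names are assumptions, not proposed API:

```rust
use std::collections::HashMap;

// Sketch: if no id2label mapping is provided, synthesize a binary one
// (LABEL_0 / LABEL_1), mirroring the transformers default of num_labels = 2.
fn default_label_mapping(id2label: Option<HashMap<i64, String>>) -> HashMap<i64, String> {
    id2label.unwrap_or_else(|| (0i64..2).map(|id| (id, format!("LABEL_{id}"))).collect())
}

fn main() {
    let mapping = default_label_mapping(None);
    assert_eq!(mapping[&0], "LABEL_0");
    assert_eq!(mapping[&1], "LABEL_1");
}
```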

Any thoughts?

jondot · Sep 11 '23

Hello @jondot,

The label mapping is loaded from the config.json file provided to initialize the model. Do you have an instance of a malformed model configuration that does not contain the label information? While creating labels "on the fly" when they are missing would allow the code to compile and run, the output is not properly formed (what would LABEL_0 mean for the downstream application?).

I'd be in favor of keeping the current setup to encourage users to provide a valid configuration; maybe additional documentation/hints for the error thrown would be helpful?
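
For reference, a well-formed config.json carries this information in its id2label (and usually label2id) fields. A minimal illustration follows; real configuration files contain many more fields, and the label names here are placeholders:

```json
{
  "id2label": { "0": "NEGATIVE", "1": "POSITIVE" },
  "label2id": { "NEGATIVE": 0, "POSITIVE": 1 }
}
```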

guillaume-be · Oct 21 '23