Raise warning in examples for classification tasks where the model's label mapping cannot automatically be matched with the dataset labels
Following https://github.com/huggingface/optimum/pull/197, this PR does what the title says. For example:
```python
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("howey/bert-base-uncased-sst2")
print(cfg)
"""prints
BertConfig {
  "_name_or_path": "howey/bert-base-uncased-sst2",
  "architectures": [
    "BertForSequenceClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "finetuning_task": "sst2",
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "problem_type": "single_label_classification",
  "transformers_version": "4.22.0.dev0",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
"""
print(cfg.label2id)
"""prints
{'LABEL_0': 0, 'LABEL_1': 1}
"""
```
So although `label2id` does not appear in the config.json, it is still an attribute of `PretrainedConfig`, filled with the default `{'LABEL_0': 0, 'LABEL_1': 1}` mapping. The previous check (e.g. `if optimizer.model.config.label2id`) was therefore always truthy and not enough to keep the example scripts from failing.
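A minimal sketch of the kind of stricter check this implies, assuming a hypothetical helper name and reusing the warning text shown below (the merged code may differ):

```python
import logging

from datasets import ClassLabel

logger = logging.getLogger(__name__)


def warn_on_label_mismatch(config, features, label_column="label"):
    """Hypothetical helper: warn when the model's label2id cannot be matched
    to the dataset's ClassLabel names (e.g. when transformers fell back to
    the default {'LABEL_0': 0, 'LABEL_1': 1} mapping)."""
    feature = features.get(label_column)
    if not isinstance(feature, ClassLabel):
        return
    model_labels = {name.lower() for name in config.label2id}
    dataset_labels = {name.lower() for name in feature.names}
    if model_labels != dataset_labels:
        logger.warning(
            f"Model label mapping: {config.label2id}\n"
            f"Dataset label features: {feature}\n"
            "Could not guarantee the model label mapping and the dataset "
            "labels match. Evaluation results may suffer from a wrong matching."
        )
```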
I tried all the examples this time.
Before submitting
- [x] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
I rebased following the changes to `ORTQuantizer`.
As an example, here is what is printed for `howey/bert-base-uncased-sst2`:

```
Model label mapping: {'LABEL_0': 0, 'LABEL_1': 1}
Dataset label features: ClassLabel(num_classes=2, names=['negative', 'positive'], id=None)
Could not guarantee the model label mapping and the dataset labels match. Evaluation results may suffer from a wrong matching.
```
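For reference, a quick way to inspect the dataset side of that comparison (a sketch assuming the GLUE sst2 validation split used by the example scripts):

```python
from datasets import load_dataset

# Load the sst2 split and look at its label feature, a ClassLabel carrying
# the human-readable names the model mapping should match.
dataset = load_dataset("glue", "sst2", split="validation")
print(dataset.features["label"])
# ClassLabel(num_classes=2, names=['negative', 'positive'], id=None)
```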