jiant
Unable to execute run_simple() with different models of the same type
Describe the bug
When run_simple() is used with different models of the same type (for example roberta-base and roberta-large), the run crashes. The code assumes they are the same model because weights are saved under hf_config.model_type instead of args.hf_pretrained_model_name_or_path. As a result, it tries to load incompatible weights and crashes.
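A minimal sketch of the collision, not jiant's actual code: the directory layout and the helper `weights_dir_by_type` are hypothetical, and the `MODEL_TYPES` table stands in for the `model_type` value read from each checkpoint's config (both checkpoints report `roberta`).

```python
from pathlib import Path

# Stand-in for hf_config.model_type: both checkpoints report "roberta".
MODEL_TYPES = {"roberta-base": "roberta", "roberta-large": "roberta"}


def weights_dir_by_type(exp_dir, model_name):
    """Hypothetical helper: key the weights directory on model_type only."""
    return Path(exp_dir) / "models" / MODEL_TYPES[model_name]


base_dir = weights_dir_by_type("/tmp/exp", "roberta-base")
large_dir = weights_dir_by_type("/tmp/exp", "roberta-large")

# Both names resolve to the same directory, so the second run would
# find the first run's weights and try to load them.
print(base_dir == large_dir)
```

Because the two paths are identical, a second run with roberta-large would pick up weights saved for roberta-base, whose tensor shapes do not match.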
To Reproduce
- Install jiant
- Run the simple example in README
- Change the model in the sample from `roberta-base` to `roberta-large` and run it again
Expected behavior
One should be able to run run_simple() with different models of the same type.
Additional context
Solution: hf_config.model_type should be used only for caching the tokenizer and tasks; args.hf_pretrained_model_name_or_path should be used for the weights.
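The proposed split could look roughly like the sketch below. This is an illustration under assumptions, not a patch against jiant: `cache_paths`, the directory names, and the `--` separator used to flatten names containing slashes are all hypothetical.

```python
from pathlib import Path


def cache_paths(exp_dir, model_type, model_name_or_path):
    """Hypothetical helper: choose cache locations for one run.

    Tokenizer/task caches are shared per model_type, while weights
    are keyed on the full model name or path.
    """
    # Flatten names such as "org/model" into a single directory name.
    safe_name = model_name_or_path.strip("/").replace("/", "--")
    root = Path(exp_dir)
    return {
        "task_cache": root / "cache" / model_type,
        "weights": root / "models" / safe_name,
    }


base = cache_paths("/tmp/exp", "roberta", "roberta-base")
large = cache_paths("/tmp/exp", "roberta", "roberta-large")

print(base["task_cache"] == large["task_cache"])  # shared tokenized cache
print(base["weights"] == large["weights"])        # separate weights
```

With this layout the two models still reuse the same tokenized task cache, but each gets its own weights directory, so switching between them no longer loads mismatched weights.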