Unable to execute run_simple() with different models of the same type

Open TimDettmers opened this issue 4 years ago • 0 comments

Describe the bug

When run_simple() is used with different models of the same type (for example roberta-base and then roberta-large), the run crashes. The code treats them as the same model because weights are saved under hf_config.model_type instead of args.hf_pretrained_model_name_or_path. As a result, it tries to load incompatible weights.

To Reproduce

  1. Install jiant
  2. Run the simple example in README
  3. Change the model in the sample from `roberta-base` to `roberta-large` and run it again

Expected behavior

One should be able to run run_simple() with different models of the same type.

Additional context

Proposed solution: keep using hf_config.model_type as the key for caching the tokenizer and tasks, but key the saved weights on args.hf_pretrained_model_name_or_path.
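A minimal sketch of the keying scheme proposed here, in case it helps. The `cache_paths` helper and the directory layout are hypothetical and are not jiant's actual code or on-disk structure; they only illustrate sharing the tokenizer/task cache per model type while keeping weights separate per model name.

```python
import os


def cache_paths(exp_dir, model_type, model_name_or_path):
    """Return illustrative cache locations for one run.

    Hypothetical helper, not part of jiant. `model_type` stands in for
    hf_config.model_type and `model_name_or_path` for
    args.hf_pretrained_model_name_or_path.
    """
    # Hub names or local paths may contain "/", so flatten them into a
    # single directory name.
    safe_name = model_name_or_path.strip("/").replace("/", "--")
    return {
        # Shared by every model of the same type (same tokenizer).
        "tokenizer_cache": os.path.join(exp_dir, "cache", model_type),
        # Unique per checkpoint, so base and large weights never collide.
        "model_weights": os.path.join(exp_dir, "models", safe_name),
    }


base = cache_paths("/tmp/exp", "roberta", "roberta-base")
large = cache_paths("/tmp/exp", "roberta", "roberta-large")
print(base["tokenizer_cache"] == large["tokenizer_cache"])
print(base["model_weights"] == large["model_weights"])
```

With this split, the two runs would reuse the same tokenized task cache but save and load weights from different directories, which is the behavior the issue asks for.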

TimDettmers • Mar 25 '22 15:03