How to load a fine-tuned model after saving it locally?
I noticed that the fine-tuned RoBERTa script saves the fine-tuned model locally with:

model_to_save = model
torch.save(model_to_save, output_model_file)
tokenizer.save_vocabulary(output_vocab_file)
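Note that torch.save(model, path) pickles the entire module object, not just a state dict, so the matching way to restore exactly that file is torch.load(), not from_pretrained(). A minimal sketch (nn.Linear is only a small stand-in for the fine-tuned RoBERTa model):

```python
import torch
import torch.nn as nn

# torch.save on a module object pickles the whole module...
model = nn.Linear(4, 2)
torch.save(model, "model.bin")

# ...so torch.load gives the module back directly. weights_only=False is
# needed on newer PyTorch versions to allow unpickling arbitrary modules;
# only do this with files you created yourself.
restored = torch.load("model.bin", weights_only=False)
print(type(restored).__name__)
```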
How can I load this model from the local folder? I tried this:
from transformers import RobertaModel, RobertaTokenizer
# from the local folder
roberta_model = RobertaModel.from_pretrained("./")
roberta_tokenizer = RobertaTokenizer.from_pretrained('./', truncation=True, do_lower_case=True)
But it failed with this error: OSError: does not appear to have a file named config.json. Checkout 'https://huggingface.co/.//main' for available files.
You are missing your config.json file: from_pretrained() needs it, and torch.save() does not create one. Make sure the required files are in the folder you are loading from.
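A sketch of the save/load round trip that from_pretrained() expects: save_pretrained() writes config.json next to the weights, which is exactly the file the OSError complains about. A tiny random config is used here so the sketch runs without downloading roberta-base; in the real script you would call save_pretrained() on your fine-tuned model and tokenizer instead. The folder name is hypothetical.

```python
from transformers import RobertaConfig, RobertaModel

# Small randomly initialized model, just for illustration.
config = RobertaConfig(hidden_size=32, num_hidden_layers=1,
                       num_attention_heads=2, intermediate_size=64)
model = RobertaModel(config)

output_dir = "./fine_tuned_roberta"      # hypothetical folder
model.save_pretrained(output_dir)        # writes config.json + weights
# tokenizer.save_pretrained(output_dir)  # the same call works for a tokenizer

# Loading back only needs the folder path:
model = RobertaModel.from_pretrained(output_dir)
```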
Some weights of RobertaModel were not initialized from the model checkpoint at roberta-base and are newly initialized: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
RuntimeError Traceback (most recent call last)
Cell In[47], line 2
      1 model = RobertaClass()
----> 2 model.to(device)

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1160, in Module.to(self, *args, **kwargs)
   1158     return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
-> 1160 return self._apply(convert)

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:810, in Module._apply(self, fn, recurse)
    808 if recurse:
    809     for module in self.children():
--> 810         module._apply(fn)

[... the Module._apply frame above repeats two more times as .to() recurses into child modules ...]

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:833, in Module._apply(self, fn, recurse)
    832 with torch.no_grad():
--> 833     param_applied = fn(param)
    834 should_use_set_data = compute_should_use_set_data(param, param_applied)

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1158, in Module.to.
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
I got this error while fine-tuning RoBERTa.
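A hedged debugging sketch for the device-side assert: the traceback points at model.to(device) only because CUDA errors are reported asynchronously, as the message itself says. Making kernel launches synchronous surfaces the real failing op:

```python
import os

# Must be set before torch initializes CUDA (ideally at the very top of the
# script): kernel launches become synchronous, so the traceback points at the
# op that actually failed instead of a later call like model.to(device).
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# Another option is to run a single batch on CPU: the same bug usually
# surfaces there as a readable IndexError. A common cause when fine-tuning is
# a token id >= the embedding size, e.g. a vocabulary/checkpoint mismatch.
```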