How to load the fine-tuned model after saving it locally

Zhuifeng414 opened this issue 2 years ago · 2 comments

I noticed the fine-tuned RoBERTa script saves the fine-tuned model locally with:

model_to_save = model
torch.save(model_to_save, output_model_file)   # pickles the entire custom module, not just a state_dict
tokenizer.save_vocabulary(output_vocab_file)   # writes only the vocab file; no config.json is produced
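
For reference, if the goal is to reload the checkpoint later with from_pretrained, the usual route is save_pretrained, which writes config.json next to the weights. A minimal sketch, assuming the RoBERTa backbone is reachable as an attribute of the custom module (the attribute name model.l1 and the output path are illustrative, not taken from the script):

import os

output_dir = "./fine_tuned_roberta"    # hypothetical path
os.makedirs(output_dir, exist_ok=True)
model.l1.save_pretrained(output_dir)   # ASSUMPTION: the RobertaModel backbone lives at model.l1
tokenizer.save_pretrained(output_dir)  # writes vocab, merges, and tokenizer config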

How do I load this model from the local folder? I tried this:

from transformers import RobertaModel, RobertaTokenizer
# from the local folder
roberta_model = RobertaModel.from_pretrained("./")
roberta_tokenizer = RobertaTokenizer.from_pretrained('./', truncation=True, do_lower_case=True)

But it failed with the error:

OSError: ./ does not appear to have a file named config.json. Checkout 'https://huggingface.co/.//main' for available files.

— Zhuifeng414, Aug 30 '23 19:08

You are missing the config.json file. Make sure the model files are placed in the directory you are loading from.
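
To expand on that: torch.save(model, path) pickles the whole custom module, so the matching way to restore it is torch.load (with the RobertaClass definition importable), not RobertaModel.from_pretrained; from_pretrained only works on a directory that contains config.json, i.e. one produced by save_pretrained. A rough sketch of both paths, reusing the file names from the snippet above (the "./fine_tuned_roberta" path is illustrative):

import torch
from transformers import RobertaModel, RobertaTokenizer

# Path 1: the checkpoint was written with torch.save(model, output_model_file),
# so unpickle the whole module; RobertaClass must be defined/importable here.
# (Newer PyTorch versions may additionally require weights_only=False.)
model = torch.load(output_model_file, map_location="cpu")
model.eval()

# Path 2: only valid if the directory was produced by save_pretrained,
# so that config.json sits next to the weights.
roberta_model = RobertaModel.from_pretrained("./fine_tuned_roberta")
roberta_tokenizer = RobertaTokenizer.from_pretrained("./fine_tuned_roberta")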

— DhavalTaunk08, Sep 07 '23 04:09

Some weights of RobertaModel were not initialized from the model checkpoint at roberta-base and are newly initialized: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

RuntimeError                              Traceback (most recent call last)
Cell In[47], line 2
      1 model = RobertaClass()
----> 2 model.to(device)

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1160, in Module.to(self, *args, **kwargs)
   1156     return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None,
   1157                 non_blocking, memory_format=convert_to_format)
   1158     return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
-> 1160 return self._apply(convert)

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:810, in Module._apply(self, fn, recurse)
    808 if recurse:
    809     for module in self.children():
--> 810         module._apply(fn)

[this frame repeats twice more as _apply recurses into child modules]

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:833, in Module._apply(self, fn, recurse)
    832 with torch.no_grad():
--> 833     param_applied = fn(param)
    834 should_use_set_data = compute_should_use_set_data(param, param_applied)

File /opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py:1158, in Module.to.<locals>.convert(t)
   1155 if convert_to_format is not None and t.dim() in (4, 5):
   1156     return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None,
   1157                 non_blocking, memory_format=convert_to_format)
-> 1158 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)

RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

I got this while fine-tuning RoBERTa.
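
For what it's worth, a device-side assert raised during model.to(device) is usually a deferred error from an earlier CUDA kernel, most often an out-of-range index (a label id at or above the number of classes, or a token id at or above the embedding size). A sketch of how one might localize it (the variable names labels, input_ids, and num_classes are illustrative, not taken from the script):

import os
# Must be set before CUDA is initialized (top of the script / notebook) so that
# kernel errors are reported synchronously and the traceback points at the real op.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch

# Common culprits worth checking before training (hypothetical variable names):
assert labels.max().item() < num_classes               # label ids must fit the classifier head
assert input_ids.max().item() < tokenizer.vocab_size   # token ids must fit the embedding table

# Running one batch on CPU also turns the opaque assert into a readable IndexError.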

— staru09, Mar 24 '24 11:03