
Finetuning RAG-Getting git.exc.InvalidGitRepositoryError

Open harithareddy84 opened this issue 2 years ago • 2 comments

System Info

  • transformers version: 4.26.0
  • Platform: Linux-5.15.0-57-generic-x86_64-with-glibc2.27
  • Python version: 3.10.8
  • Huggingface_hub version: 0.12.0
  • PyTorch version (GPU?): 1.13.1 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

For finetuning, I ran the command below:

!python "finetune_rag.py" \
    --data_dir "RAGData" \
    --output_dir "output_ft" \
    --model_name_or_path facebook/rag-token-base \
    --model_type rag_token \
    --distributed_retriever pytorch \
    --gpus 2 \
    --do_train \
    --do_predict

Getting the following error:

The tokenizer class you load from this checkpoint is 'RagTokenizer'. The class this function is called from is 'BartTokenizerFast'.
Traceback (most recent call last):
  File "/workspace/finetune_rag.py", line 649, in <module>
    main(args)
  File "/workspace/finetune_rag.py", line 586, in main
    model: GenerativeQAModule = GenerativeQAModule(args)
  File "/workspace/finetune_rag.py", line 157, in __init__
    save_git_info(self.hparams.output_dir)
  File "/workspace/utils_rag.py", line 145, in save_git_info
    repo_infos = get_git_info()
  File "/workspace/utils_rag.py", line 161, in get_git_info
    repo = git.Repo(path)
  File "/opt/conda/lib/python3.10/site-packages/git/repo/base.py", line 282, in __init__
    self.working_dir: Optional[PathLike] = self._working_tree_dir or self.common_dir
  File "/opt/conda/lib/python3.10/site-packages/git/repo/base.py", line 363, in common_dir
    raise InvalidGitRepositoryError()
git.exc.InvalidGitRepositoryError

Please help.

Who can help?

@lhoestq

Information

  • [ ] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [ ] My own task or dataset (give details below)

Reproduction

For finetuning, I ran the command below in a Jupyter notebook:

!python "finetune_rag.py" \
    --data_dir "RAGData" \
    --output_dir "output_ft" \
    --model_name_or_path facebook/rag-token-base \
    --model_type rag_token \
    --distributed_retriever pytorch \
    --gpus 2 \
    --do_train \
    --do_predict

Expected behavior

Finetuning of RAG

harithareddy84 avatar Feb 10 '23 06:02 harithareddy84

Hi @harithareddy84 👋 It seems like you have a git-related issue, not a transformers-related issue. I'm afraid we won't be able to help -- try searching for the error message :)

gante avatar Feb 10 '23 12:02 gante

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Mar 12 '23 15:03 github-actions[bot]

Hi! Were you able to resolve the error?

mujeeb-gh avatar Apr 04 '24 04:04 mujeeb-gh
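
For anyone landing here with the same traceback: the exception is raised by git.Repo(path) inside get_git_info, which suggests the script was started from a directory (/workspace here) that GitPython does not recognise as a Git repository. One possible workaround is to check for a repository before collecting Git metadata. The sketch below uses only the standard library; find_git_root and get_git_info_or_placeholder are hypothetical helper names, not functions from finetune_rag.py or utils_rag.py, and the returned keys are illustrative rather than what the script actually records.

```python
import socket
from pathlib import Path
from typing import Optional


def find_git_root(start: str = ".") -> Optional[Path]:
    """Return the nearest ancestor of `start` that contains a `.git` entry, or None."""
    current = Path(start).resolve()
    for candidate in (current, *current.parents):
        # `.git` is a directory in a normal clone and a file in worktrees/submodules.
        if (candidate / ".git").exists():
            return candidate
    return None


def get_git_info_or_placeholder(start: str = ".") -> dict:
    """Hypothetical stand-in for a git-info helper that never raises.

    Returns a placeholder instead of failing when `start` is not inside
    a Git working tree.
    """
    root = find_git_root(start)
    return {
        "repo_root": str(root) if root is not None else "not a git repository",
        "hostname": socket.gethostname(),
    }
```

A guard like this could be called before save_git_info so that the Git step is skipped when find_git_root returns None. Alternatively, running the script from inside a clone of a Git repository should avoid the exception altogether, and with GitPython itself the same effect can be had by catching git.exc.InvalidGitRepositoryError around the git.Repo call.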