
I was trying to create a custom tokenizer for a language and got the following error/warning.


System Info

The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.
Moving 11 files to the new cache system
0%
0/11 [00:02<?, ?it/s]
There was a problem when trying to move your cache:

  File "C:\Users\shiva\anaconda3\lib\site-packages\transformers\utils\hub.py", line 1127, in <module>
    move_cache()

  File "C:\Users\shiva\anaconda3\lib\site-packages\transformers\utils\hub.py", line 1090, in move_cache
    move_to_new_cache(

  File "C:\Users\shiva\anaconda3\lib\site-packages\transformers\utils\hub.py", line 1047, in move_to_new_cache
    huggingface_hub.file_download._create_relative_symlink(blob_path, pointer_path)

  File "C:\Users\shiva\anaconda3\lib\site-packages\huggingface_hub\file_download.py", line 841, in _create_relative_symlink
    raise OSError(


(Please file an issue at https://github.com/huggingface/transformers/issues/new/choose and copy paste this whole message and we will do our best to help.)

Information

  • [ ] The official example scripts
  • [x] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [x] My own task or dataset (give details below)

Reproduction

from transformers import PreTrainedTokenizerFast

# load the trained tokenizer in a transformers tokenizer instance
tokenizer = PreTrainedTokenizerFast(
    tokenizer_object=tokenizer,
    unk_token='[UNK]',
    pad_token='[PAD]',
    cls_token='[CLS]',
    sep_token='[SEP]',
    mask_token='[MASK]',
)

# save the pretrained tokenizer
tokenizer.save_pretrained('bert-base-dv-hi')
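
For context, the tokenizer_object passed above comes from the tokenizers library. A minimal sketch of how such a tokenizer might be built and trained before being wrapped (the WordPiece model, vocabulary size, and training file names are assumptions, not taken from the report):

from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# build a WordPiece tokenizer (model choice is illustrative)
tokenizer = Tokenizer(models.WordPiece(unk_token='[UNK]'))
tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()

# train on plain-text files (hypothetical file names)
trainer = trainers.WordPieceTrainer(
    vocab_size=30000,
    special_tokens=['[UNK]', '[PAD]', '[CLS]', '[SEP]', '[MASK]'],
)
tokenizer.train(files=['corpus_dv.txt', 'corpus_hi.txt'], trainer=trainer)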

Expected behavior

It should print out:
('bert-base-dv-hi\\tokenizer_config.json',
 'bert-base-dv-hi\\special_tokens_map.json',
 'bert-base-dv-hi\\tokenizer.json')


yes-its-shivam · Sep 15 '22

Hey @yes-its-shivam, thanks for reporting! I think this may have to do with our backend trying to create symlinks for the cached files, and failing to do so!

It seems you're running on Windows, which requires developer mode to be activated (or for Python to be run as an administrator).

To enable your device for development, we recommend reading this guide from Microsoft: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development

LysandreJik · Sep 15 '22
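
One quick way to tell whether this is the problem is to test whether the current Python process is allowed to create symlinks at all; on Windows this typically requires developer mode or elevated privileges. A minimal sketch using only the standard library (illustrative, not from the thread):

import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'blob')
    dst = os.path.join(tmp, 'pointer')
    open(src, 'w').close()
    try:
        # this is the operation the cache migration relies on
        os.symlink(src, dst)
        print('symlinks available; the cache migration should work')
    except OSError as exc:
        print('symlink creation failed:', exc)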

Hi @LysandreJik. As far as I can see, this does not just happen once when moving the cache, but also for every new model that you download. That means that for every model I download I would have to find the Python binary of my venv, run it as admin, download the model, and then continue my work, or enable developer mode on Windows, which also requires admin privileges and comes with other things I may not wish to enable on my device (like allowing sideloading of unverified third-party apps).

As far as I can see, this change means that anyone who does not have admin privileges on their system (people using the family computer, school computers, student laptops in class, etc.) cannot use transformers. I'd love to be wrong about this, but at first glance this treats Windows as the unfavored child once again. Can we try to look for a way around this?

Edit: this is not something I am eager to have to enable:

[screenshot: Windows developer mode warning]

BramVanroy · Sep 16 '22

Thanks for reporting @BramVanroy, I'm currently opening an issue on huggingface_hub so that we may track it.

However, if I'm not mistaken, Developer Mode must be enabled in order to use WSL, right? I would expect most developers to choose WSL when using transformers on Windows, but I may be mistaken about that.

LysandreJik · Sep 16 '22

Opened an issue here to track all related issues: https://github.com/huggingface/huggingface_hub/issues/1062

LysandreJik · Sep 16 '22

For the record, you do not need developer mode for WSL. I'm having the same problem, and having to turn on developer mode will cost us some of our user base: the warning will intimidate people away from using it.

ebolam · Sep 19 '22

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] · Oct 15 '22

I think the issue has been solved on the huggingface_hub side, as long as you use the latest version. Please let us know otherwise!

sgugger · Oct 17 '22

> I think the issue has been solved on the huggingface_hub side, as long as you use the latest version. Please let us know otherwise!

I am using the latest version of huggingface_hub (0.11.0), but I am still facing the same issue.

The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.
Moving 0 files to the new cache system
0it [00:00, ?it/s]
0it [00:00, ?it/s]
There was a problem when trying to write in your cache folder (./tmp/). You should set the environment variable TRANSFORMERS_CACHE to a writable directory.
TRANSFORMERS_CACHE = ./tmp/
The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.
Moving 0 files to the new cache system
0it [00:00, ?it/s]
0it [00:00, ?it/s]
There was a problem when trying to write in your cache folder (./tmp/). You should set the environment variable TRANSFORMERS_CACHE to a writable directory.

chenye-814 · Nov 23 '22
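
For the writable-cache variant reported above, one approach is to point TRANSFORMERS_CACHE at an absolute, writable directory before transformers is imported, since the migration runs at import time (see the traceback earlier in this thread). A minimal sketch; the directory name is an assumption:

import os

# use an absolute path; relative paths such as ./tmp/ resolve against the
# current working directory, which may not be writable
cache_dir = os.path.abspath('hf_cache')
os.makedirs(cache_dir, exist_ok=True)
os.environ['TRANSFORMERS_CACHE'] = cache_dir

import transformers  # the cache check/migration happens at import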

@chenye-814 did you figure it out? I am having the same issue: "There was a problem when trying to write in your cache folder (/documents). You should set the environment variable TRANSFORMERS_CACHE to a writable directory." I already set the environment variable TRANSFORMERS_CACHE=documents.

manzanofab · Sep 14 '23