HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name'
DeepSpeed training error: how to deal with it?
!deepspeed --num_gpus=1 \
  --module training.trainer \
  --input-model /content/drive/MyDrive/EleutherAI/pythia-6.9b \
  --deepspeed config/ds_z3_bf16_config.json \
  --epochs 2 \
  --local-output-dir /content/drive/MyDrive/dolly/training_out \
  --dbfs-output-dir /content/drive/MyDrive/dolly/dolly_training \
  --per-device-train-batch-size 6 \
  --per-device-eval-batch-size 6 \
  --logging-steps 10 \
  --save-steps 200 \
  --save-total-limit 20 \
  --eval-steps 50 \
  --warmup-steps 50 \
  --test-size 200 \
  --lr 5e-6
HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/content/drive/MyDrive/EleutherAI/pythia-6.9b'. Use repo_type argument if needed.
Just pass --input-model EleutherAI/pythia-6.9b
Are you sure that path exists and contains the model? Usually I’ve seen that when the model does not exist at that path or some expected files are missing.
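That failure mode can be made visible with a small check. A hedged sketch (not code from the dolly repo): transformers loads from local files only when the string is an existing directory; anything else must look like 'repo_name' or 'namespace/repo_name', so a dead local path surfaces as HFValidationError.

```python
import os

# Sketch (assumption, not repo code): reproduce the loader's decision.
# An existing directory is loaded from local files; anything else must
# look like 'repo_name' or 'namespace/repo_name', so a path that does
# not exist on disk fails repo-id validation instead.
def classify_input_model(input_model: str) -> str:
    if os.path.isdir(input_model):
        return "local directory"
    if not input_model.startswith(("/", "~", ".")) and input_model.count("/") <= 1:
        return "hub repo id"
    return "invalid: not an existing directory and not a valid repo id"
```

So `os.path.isdir('/content/drive/MyDrive/EleutherAI/pythia-6.9b')` returning False is the real signal here, before DeepSpeed is even involved.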
Just pass --input-model EleutherAI/pythia-6.9b
This answer is right.
What if my local model path is nested three directory levels deep?
HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name':
What do you mean 3 dirs? your model exists in a directory on a file system locally. You pass the path to that dir.
It means /dir1/dir2/dir3/pythin1milion-7.2b and it says HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/dir1/dir2/dir3/pythin1milion-7.2b'. Use repo_type argument if needed.
It's saying the directory doesn't contain what it expects. It should look something like https://huggingface.co/databricks/dolly-v2-12b/tree/main
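A quick way to sanity-check a local model directory before launching training. This is a hedged sketch: the exact file set varies by model and serialization format, and these names are only illustrative examples of what typically appears (cf. the dolly-v2-12b tree linked above).

```python
import os

# Hedged sketch: check a local model directory for files that
# from_pretrained() typically expects. The list is illustrative, not
# exhaustive; real directories also contain weight shards, etc.
EXPECTED_FILES = ["config.json", "tokenizer_config.json"]

def missing_model_files(model_dir: str) -> list:
    present = set(os.listdir(model_dir)) if os.path.isdir(model_dir) else set()
    return [name for name in EXPECTED_FILES if name not in present]
```

If this returns a non-empty list, the loader will fall back to treating your path as a Hub repo id, which is exactly the error in this thread.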
Hey, were you able to resolve this issue? I encounter the same issue while invoking a pre-trained model saved in an S3 bucket.
I get the same error while trying to run it from the locally saved model in Databricks. I can see the path has files, but I see some differences and don't know why, because I followed the steps in the README to generate that model locally.
Snapshot of the files that I have is below: [screenshot not reproduced]
I was able to run it successfully by changing the path to '/dbfs/dolly_training/dolly_2023-06-29T17:40:14', but it does not recognize 'dbfs:/dolly_training/dolly_2023-06-29T17:40:14'. It internally checks whether the passed path is a directory, using code like:
import os
dir = "dbfs:/dolly_training"
os.path.isdir(dir)
This returns False if you pass a directory path prefixed with dbfs:/, but not if you pass the /dbfs/ path.
Correct, you use /dbfs, not dbfs:, with tools that only know local files. But I'm not clear on what you're saying the root cause was.
The point I was trying to make is that even though "dbfs:/dolly_training/" is a directory, os.path.isdir("dbfs:/dolly_training") returns False. That's what causes the confusion: you pass the correct path where you saved your model, you can see the files at that location, and you still can't run the model from that path.
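To avoid that confusion, one option is to normalize the URI before any os.path call. A minimal sketch, assuming the standard Databricks FUSE mount at /dbfs (the helper name is hypothetical, not part of the dolly repo):

```python
# Hedged sketch: plain filesystem calls (os.path.isdir, open, ...) do not
# understand the "dbfs:/" URI scheme; on Databricks the same storage is
# exposed through the local FUSE mount "/dbfs/". Normalize before checking.
def to_fuse_path(path: str) -> str:
    if path.startswith("dbfs:/"):
        return "/dbfs/" + path[len("dbfs:/"):].lstrip("/")
    return path
```

With that, both spellings of the training-output directory resolve to the same local path that os.path.isdir can actually see.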
I am testing and running train_dolly. First I got the same issue as above (Error: Repo id must be in the form 'repo_name' or 'namespace/repo_name') and tried to do as advised here, but I get a new error: "SyntaxError: invalid decimal literal". What am I doing wrong? Thanks for your help.
Not sure, you're not showing where that error arises
I tried this to pass the input to the model:
from training.generate import generate_response, load_model_tokenizer_for_generate
pass --input-model EleutherAI/pythia-6.9b
model, tokenizer = load_model_tokenizer_for_generate(dbfs_output_dir)
I also tried it as the first command, and I tried leaving out "pass", but everything resulted in the same error.
I'm not sure what you mean; that's a command line argument but you seem to be referring to code.
c:\Users\seshsan...\databricks_cli_0.200.1_windows_amd64>databricks version
Databricks CLI v0.200.1
c:\Users\seshsan...\databricks_cli_0.200.1_windows_amd64>--input-model EleutherAI/pythia-6.9b
'--input-model' is not recognized as an internal or external command, operable program or batch file.
c:\Users\seshsan...\databricks_cli_0.200.1_windows_amd64>pass --input-model EleutherAI/pythia-6.9b
'pass' is not recognized as an internal or external command, operable program or batch file.
Er, you're just typing those words on the command line! --input-model is a flag you pass to deepspeed, as in the invocation at the top of this thread.
If you're trying to load the model checkpoints from local machine, give the FULL directory path where the model checkpoints are saved.
FYI, passing a relative path like "~/models/7B" fails.
Giving the full path, like "/home/users/abc/models/7B", works fine.
This worked for me.
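The "~" case is explainable: the loader does not expand it, so "~/models/7B" is taken literally, doesn't exist as a directory, and then fails repo-id validation. A small sketch of the normalization worth doing yourself before passing the path along:

```python
import os

# Hedged sketch: expand "~" and relative segments before handing the
# path to the model loader, which otherwise treats the string literally
# and mistakes "~/models/7B" for a (malformed) Hub repo id.
def resolve_model_path(path: str) -> str:
    return os.path.abspath(os.path.expanduser(path))
```

After this, the check reduces to `os.path.isdir(resolve_model_path(p))`, which matches the "use the FULL directory path" advice above.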
Thank you @matthayes, indeed the problem was the directory path
Had the same problem and full path was able to resolve it, thanks.
I've seen this error and I had the wrong path specified. Run a simple ls command to verify.
Hey, I would like to add to this discussion since it came up on Google when I was searching this error. If you're like me and writing a Python Flask API with these tools, the path is resolved relative to the folder where you defined your Flask app instance, not relative to the file containing the route that calls the model. In my case, my Flask app was defined in "./server.py" (root), my models were saved in "./models/model_name", and I was calling the model from a route in "./flaskApi/route.py". Originally I called from_pretrained("./../models/model_name"), which resulted in the above error; once I changed it to from_pretrained("./models/model_name"), it worked as expected.
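One way to make that robust regardless of where the process was started is to anchor the path to a file's own location instead of the working directory. A hedged sketch under the layout described above (server.py at the project root, routes in flaskApi/; the helper name is hypothetical):

```python
import os

# Hedged sketch: build the model path from a module's __file__ rather
# than the current working directory, so a route module one level down
# (./flaskApi/route.py) still resolves ./models/model_name at the root.
def model_path_from(module_file: str, relative: str) -> str:
    # Parent of the module's directory is taken as the project root here.
    project_root = os.path.dirname(os.path.dirname(os.path.abspath(module_file)))
    return os.path.join(project_root, relative)
```

From ./flaskApi/route.py you would then call model_path_from(__file__, "models/model_name") and pass the result to from_pretrained.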