HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name'
DeepSpeed training error: how to deal with it?
!deepspeed --num_gpus=1 \
  --module training.trainer \
  --input-model /content/drive/MyDrive/EleutherAI/pythia-6.9b \
  --deepspeed config/ds_z3_bf16_config.json \
  --epochs 2 \
  --local-output-dir /content/drive/MyDrive/dolly/training_out \
  --dbfs-output-dir /content/drive/MyDrive/dolly/dolly_training \
  --per-device-train-batch-size 6 \
  --per-device-eval-batch-size 6 \
  --logging-steps 10 \
  --save-steps 200 \
  --save-total-limit 20 \
  --eval-steps 50 \
  --warmup-steps 50 \
  --test-size 200 \
  --lr 5e-6
HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/content/drive/MyDrive/EleutherAI/pythia-6.9b'. Use repo_type argument if needed.
Just pass --input-model EleutherAI/pythia-6.9b
Are you sure that path exists and contains the model? Usually I’ve seen that when the model does not exist at that path or some expected files are missing.
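That failure mode can be made visible with a small check. A hedged sketch (not code from the dolly repo): transformers loads from local files only when the string is an existing directory; anything else must look like 'repo_name' or 'namespace/repo_name', so a dead local path surfaces as HFValidationError.

```python
import os

# Sketch (assumption, not repo code): reproduce the loader's decision.
# An existing directory is loaded from local files; anything else must
# look like 'repo_name' or 'namespace/repo_name', so a path that does
# not exist on disk fails repo-id validation instead.
def classify_input_model(input_model: str) -> str:
    if os.path.isdir(input_model):
        return "local directory"
    if not input_model.startswith(("/", "~", ".")) and input_model.count("/") <= 1:
        return "hub repo id"
    return "invalid: not an existing directory and not a valid repo id"
```

So `os.path.isdir('/content/drive/MyDrive/EleutherAI/pythia-6.9b')` returning False is the real signal here, before DeepSpeed is even involved.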
Just pass --input-model EleutherAI/pythia-6.9b
This answer is right.
What if my local model path is nested three directory levels deep?
HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name':
What do you mean 3 dirs? your model exists in a directory on a file system locally. You pass the path to that dir.
It means /dir1/dir2/dir3/pythin1milion-7.2b and it says HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/dir1/dir2/dir3/pythin1milion-7.2b'. Use repo_type argument if needed.
It's saying the directory doesn't contain what it expects. It should look something like https://huggingface.co/databricks/dolly-v2-12b/tree/main
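A quick way to sanity-check a local model directory before launching training. This is a hedged sketch: the exact file set varies by model and serialization format, and these names are only illustrative examples of what typically appears (cf. the dolly-v2-12b tree linked above).

```python
import os

# Hedged sketch: check a local model directory for files that
# from_pretrained() typically expects. The list is illustrative, not
# exhaustive; real directories also contain weight shards, etc.
EXPECTED_FILES = ["config.json", "tokenizer_config.json"]

def missing_model_files(model_dir: str) -> list:
    present = set(os.listdir(model_dir)) if os.path.isdir(model_dir) else set()
    return [name for name in EXPECTED_FILES if name not in present]
```

If this returns a non-empty list, the loader will fall back to treating your path as a Hub repo id, which is exactly the error in this thread.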
Hey, were you able to resolve this issue? I encounter the same issue while invoking a pre-trained model saved in an S3 bucket.
I get the same error while trying to run it from the locally saved model in Databricks. I can see the path has files, but I see some differences and don't know why, because I followed the steps in the README to generate that model locally.
Snapshot of the files that I have is below: [screenshot not reproduced]
I was able to run it successfully by changing the path to '/dbfs/dolly_training/dolly_2023-06-29T17:40:14', but it does not recognize 'dbfs:/dolly_training/dolly_2023-06-29T17:40:14'. It internally checks whether the passed path is a directory, using code like:
import os
dir = "dbfs:/dolly_training"
os.path.isdir(dir)
This returns False if you pass a directory path prefixed with dbfs:/, but not if you pass the /dbfs/ path.
Correct, you use /dbfs, not dbfs:, with tools that only know local files. But I'm not clear on what you're saying the root cause was.
The point I was trying to make is that even though "dbfs:/dolly_training/" is a directory, os.path.isdir("dbfs:/dolly_training") returns False. That's what causes the confusion: you pass the correct path where you saved your model, you can see the files at that location, and you still can't run the model from that path.
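To avoid that confusion, one option is to normalize the URI before any os.path call. A minimal sketch, assuming the standard Databricks FUSE mount at /dbfs (the helper name is hypothetical, not part of the dolly repo):

```python
# Hedged sketch: plain filesystem calls (os.path.isdir, open, ...) do not
# understand the "dbfs:/" URI scheme; on Databricks the same storage is
# exposed through the local FUSE mount "/dbfs/". Normalize before checking.
def to_fuse_path(path: str) -> str:
    if path.startswith("dbfs:/"):
        return "/dbfs/" + path[len("dbfs:/"):].lstrip("/")
    return path
```

With that, both spellings of the training-output directory resolve to the same local path that os.path.isdir can actually see.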
I am testing and running train_dolly. First I got the same issue as above (Error: Repo id must be in the form 'repo_name' or 'namespace/repo_name') and tried to do as advised here, but I get a new error: "SyntaxError: invalid decimal literal". What am I doing wrong? Thanks for your help.
Not sure, you're not showing where that error arises
I tried this to pass the input to the model:
from training.generate import generate_response, load_model_tokenizer_for_generate
pass --input-model EleutherAI/pythia-6.9b
model, tokenizer = load_model_tokenizer_for_generate(dbfs_output_dir)
I also tried it as the first command, and I tried leaving out "pass", but everything resulted in the same error.
I'm not sure what you mean; that's a command line argument but you seem to be referring to code.
c:\Users\seshsan...\databricks_cli_0.200.1_windows_amd64>databricks version
Databricks CLI v0.200.1
c:\Users\seshsan...\databricks_cli_0.200.1_windows_amd64>--input-model EleutherAI/pythia-6.9b
'--input-model' is not recognized as an internal or external command, operable program or batch file.
c:\Users\seshsan...\databricks_cli_0.200.1_windows_amd64>pass --input-model EleutherAI/pythia-6.9b
'pass' is not recognized as an internal or external command, operable program or batch file.
Er, you're just typing those words on the command line! --input-model is a flag you pass to deepspeed, as in the invocation at the top of this thread.
If you're trying to load the model checkpoints from local machine, give the FULL directory path where the model checkpoints are saved.
FYI, passing a relative path like "~/models/7B" fails.
Giving the full path, like "/home/users/abc/models/7B", works fine.
This worked for me.
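The "~" case is explainable: the loader does not expand it, so "~/models/7B" is taken literally, doesn't exist as a directory, and then fails repo-id validation. A small sketch of the normalization worth doing yourself before passing the path along:

```python
import os

# Hedged sketch: expand "~" and relative segments before handing the
# path to the model loader, which otherwise treats the string literally
# and mistakes "~/models/7B" for a (malformed) Hub repo id.
def resolve_model_path(path: str) -> str:
    return os.path.abspath(os.path.expanduser(path))
```

After this, the check reduces to `os.path.isdir(resolve_model_path(p))`, which matches the "use the FULL directory path" advice above.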
Thank you @matthayes, indeed the problem was the directory path
Had the same problem and full path was able to resolve it, thanks.
I've seen this error and I had the wrong path specified. Run a simple ls command to verify.
Hey, I would like to add to this discussion since it came up on Google when I was searching this error. If you're like me and writing a Python Flask API with these tools, the path is resolved relative to the folder where you defined your Flask app instance, not relative to the file containing the route that calls the model. In my case, my Flask app was defined in "./server.py" (root), my models were saved in "./models/model_name", and I was calling the model from a route in "./flaskApi/route.py". Originally I called from_pretrained("./../models/model_name"), which resulted in the above error; once I changed it to from_pretrained("./models/model_name"), it worked as expected.
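One way to make that robust regardless of where the process was started is to anchor the path to a file's own location instead of the working directory. A hedged sketch under the layout described above (server.py at the project root, routes in flaskApi/; the helper name is hypothetical):

```python
import os

# Hedged sketch: build the model path from a module's __file__ rather
# than the current working directory, so a route module one level down
# (./flaskApi/route.py) still resolves ./models/model_name at the root.
def model_path_from(module_file: str, relative: str) -> str:
    # Parent of the module's directory is taken as the project root here.
    project_root = os.path.dirname(os.path.dirname(os.path.abspath(module_file)))
    return os.path.join(project_root, relative)
```

From ./flaskApi/route.py you would then call model_path_from(__file__, "models/model_name") and pass the result to from_pretrained.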