qlora
Getting error on loading model
When I try this in Colab:

tokenizer = AutoTokenizer.from_pretrained("emilianJR/CyberRealistic_V3")
model = AutoModelForCausalLM.from_pretrained("emilianJR/CyberRealistic_V3")

I get:
HTTPError                                 Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py in hf_raise_for_status(response, endpoint_name)
    258     try:
--> 259         response.raise_for_status()
    260     except HTTPError as e:

12 frames
HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/emilianJR/CyberRealistic_V3/resolve/main/config.json

The above exception was the direct cause of the following exception:

EntryNotFoundError                        Traceback (most recent call last)
EntryNotFoundError: 404 Client Error. (Request ID: Root=1-6482c26e-527cb8985586f80e2bd7bbc6)
Entry Not Found for url: https://huggingface.co/emilianJR/CyberRealistic_V3/resolve/main/config.json.

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py in cached_file(path_or_repo_id, filename, cache_dir, force_download, resume_download, proxies, use_auth_token, revision, local_files_only, subfolder, repo_type, user_agent, _raise_exceptions_for_missing_entries, _raise_exceptions_for_connection_errors, _commit_hash)
    461     if revision is None:
    462         revision = "main"
--> 463     raise EnvironmentError(
    464         f"{path_or_repo_id} does not appear to have a file named {full_filename}. Checkout "
    465         f"'https://huggingface.co/{path_or_repo_id}/{revision}' for available files."

OSError: emilianJR/CyberRealistic_V3 does not appear to have a file named config.json. Checkout 'https://huggingface.co/emilianJR/CyberRealistic_V3/main' for available files.
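As the OSError itself suggests, it is worth checking which files the repo actually contains before loading. A minimal sketch using huggingface_hub (only the repo id comes from the question; note that the resolve URL used for tuning below points at unet/diffusion_pytorch_model.bin, a diffusers-style layout, so a top-level config.json for AutoModelForCausalLM may simply not exist):

from huggingface_hub import list_repo_files

# List the files in the repo referenced by the error above; if there is no
# top-level config.json, AutoModelForCausalLM.from_pretrained cannot load it.
files = list_repo_files("emilianJR/CyberRealistic_V3")
print("\n".join(files))
print("has top-level config.json:", "config.json" in files)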
When I try to tune with

!python /content/qlora/qlora.py https://huggingface.co/emilianJR/CyberRealistic_V3/resolve/main/unet/diffusion_pytorch_model.bin /content/new
I get this error:

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

and submit this information together with your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
bin /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cuda118.so
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: /usr/lib64-nvidia did not contain ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] as expected! Searching further paths...
warn(msg)
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/sys/fs/cgroup/memory.events /var/colab/cgroup/jupyter-children/memory.events')}
warn(msg)
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('8013'), PosixPath('http'), PosixPath('//172.28.0.1')}
warn(msg)
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('--logtostderr --listen_host=172.28.0.12 --target_host=172.28.0.12 --tunnel_background_save_url=https'), PosixPath('//colab.research.google.com/tun/m/cc48301118ce562b961b3c22d803539adc1e0c19/gpu-t4-s-tfe6ei5vsc3s --tunnel_background_save_delay=10s --tunnel_periodic_background_save_frequency=30m0s --enable_output_coalescing=true --output_coalescing_required=true')}
warn(msg)
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/env/python')}
warn(msg)
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('//ipykernel.pylab.backend_inline'), PosixPath('module')}
warn(msg)
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cuda_setup/main.py:149: UserWarning: Found duplicate ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] files: {PosixPath('/usr/local/cuda/lib64/libcudart.so.11.0'), PosixPath('/usr/local/cuda/lib64/libcudart.so')}.. We'll flip a coin and try one of these, in order to fail forward.
Either way, this might cause trouble in the future:
If you get CUDA error: invalid device function
errors, the above might be the cause and the solution is to make sure only one ['libcudart.so', 'libcudart.so.11.0', 'libcudart.so.12.0'] in the paths that we search based on your env.
warn(msg)
CUDA SETUP: CUDA runtime path found: /usr/local/cuda/lib64/libcudart.so.11.0
CUDA SETUP: Highest compute capability among GPUs detected: 7.5
CUDA SETUP: Detected CUDA version 118
CUDA SETUP: Loading binary /usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cuda118.so...
2023-06-09 06:13:31.514413: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
loading base model EleutherAI/pythia-12b...
The model weights are not tied. Please use the `tie_weights` method before using the `infer_auto_device` function.
Loading checkpoint shards: 0% 0/3 [00:00<?, ?it/s]
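One thing visible in the log above: it falls back to loading the default base model EleutherAI/pythia-12b, which suggests the positional URL was not used as the model path. A hedged sketch of the usual invocation (flag names assumed from the qlora README; the output path is the one from the command above):

!python /content/qlora/qlora.py \
    --model_name_or_path EleutherAI/pythia-12b \
    --output_dir /content/new

A diffusers-layout checkpoint such as the unet .bin above is not a causal language model, so it would likely not be loadable here even with the right flag.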
I got the same issue ><
following
Same
Found something that helped a lot. If you use oobabooga, it is probably because you downloaded it as a zip. Just do:

git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
pip install -r requirements.txt

conda activate textgen
cd text-generation-webui
pip install -r requirements.txt --upgrade

python download-model.py facebook/opt-1.3b

and then it should work without any warnings or issues.
Same error. I tried multiple machines on vast.ai and they all give this error:
The model weights are not tied. Please use the `tie_weights` method before using the `infer_auto_device` function.
Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/root/qlora/qlora.py", line 807, in <module>
train()
File "/root/qlora/qlora.py", line 643, in train
model = get_accelerate_model(args, checkpoint_dir)
File "/root/qlora/qlora.py", line 280, in get_accelerate_model
model = AutoModelForCausalLM.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 484, in from_pretrained
return model_class.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 2881, in from_pretrained
) = cls._load_pretrained_model(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 3227, in _load_pretrained_model
new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 728, in _load_state_dict_into_meta_model
set_module_quantized_tensor_to_device(
File "/usr/local/lib/python3.10/dist-packages/transformers/utils/bitsandbytes.py", line 101, in set_module_quantized_tensor_to_device
new_value = value.to(device)
File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 247, in _lazy_init
torch._C._cuda_init()
RuntimeError: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 804: forward compatibility was attempted on non supported HW
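Worth noting: the crash here is raised by torch._C._cuda_init() itself, before any qlora-specific code runs, and Error 804 typically points at a mismatch between the host NVIDIA driver and the CUDA runtime inside the image. A small, qlora-independent diagnostic sketch:

import torch

# If this already fails or reports no CUDA device, the problem is the
# host driver / CUDA runtime pairing on the machine, not the training script.
print("torch version:", torch.__version__)
print("built with CUDA:", torch.version.cuda)
print("cuda available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device count:", torch.cuda.device_count())
    print("device name:", torch.cuda.get_device_name(0))

Comparing this against nvidia-smi on the host (which reports the CUDA version the driver supports) usually narrows it down.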
Try this:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model_name = "your_model_name"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Tie the weights
model.tie_weights()

Now you can use the model for inference without encountering the error.
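If the question is where such a call would go when running qlora.py rather than a notebook: based only on the call sites shown in the traceback above, a hypothetical placement would be right after the model is built in train(). Whether this actually removes the warning depends on where accelerate emits it during loading, so treat it as a sketch, not a fix; it does not address the CUDA Error 804 above.

# Hypothetical sketch inside qlora.py's train(), per the traceback above
model = get_accelerate_model(args, checkpoint_dir)
model.tie_weights()  # hypothetical placement mirroring the suggestion above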
This might be a dumb question, but what file are you saying to add this code into?
EDIT: Oops, I was just googling how to fix this whole tie_weights issue from a text-generation-webui standpoint, and I am now seeing that OP was working from a Google Colab notebook rather than a local machine. Never mind!
What appears to work for me in oobabooga is:

Available extensions: API, gallery
Boolean command-line flags: pin_weight

Then, when clicking on Model (between Parameters and Training), you select your model (I use Pygmalion 13B) and make sure:

wbits: 4
groupsize: 128
model_type: LLAMA
autodevices: ON
gptq-for-llama: ON

Save settings and reload the model.
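For anyone who prefers the command line over the UI, text-generation-webui exposed equivalent startup flags around the time of this thread (flag names are an assumption from that era of the project, so check python server.py --help on your checkout; the model folder name is a placeholder):

python server.py --model your-model-folder --wbits 4 --groupsize 128 --model_type llama --auto-devices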
Thank you, this ended up working for me as well! I was confused because it was happening with both older and newer models I'd downloaded, and they definitely did have a config file, so there might be something I did that interferes with them loading properly. I'm sort of blindly stumbling through this like so many others, and I appreciate your help!
The above solution did not work for me with alternate models. All of the previous models I was using no longer load at all. I have come to the conclusion that oobabooga is presently broken.
Reinstall oobabooga; there may be an update.
I started running into this problem today with normal huggingface models (not working with oobabooga).
Downgrading accelerate from version 0.20.3 to 0.19.0 fixed it for me.
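For reference, the downgrade is just a pinned pip install (assuming a pip-managed environment like the Colab/vast.ai images above):

pip install accelerate==0.19.0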
I tried this, and it works without the warning.
model_name = "your_model_name"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
Apologies, but where does this code need to be injected? In server.py?