GPU is required to quantize or run quantize model
Trying to run the example and getting an error that says "GPU is required to quantize or run quantize model".
Hey @RealJavascriptKid,
Thanks a lot for reaching out and giving our project a shot! I'm aware that some of the examples in the README might not work as expected due to the project's early stage.
To help me understand the problem better and get you up and running, could you let me know a bit more about your setup and what's going on? Any information on your hardware, software version, error messages, and steps you've taken so far would be helpful.
- Hardware Setup: Any details about your hardware, like your GPU model, would be great.
- Software and Environment: What version of the project are you using? And what's your software environment like (e.g., operating system, Python version)?
- Error Messages: If you're seeing any error messages or weird behavior, sharing those will give me a better idea of what's happening.
- Your Steps So Far: Have you tried anything on your end to fix the issue? Knowing what you've already done can help me avoid suggesting stuff you've already explored.
Thanks again for reaching out!
First of all, I really like the simple programming API. I always appreciate easier APIs that don't go into too much detail.
I am using a Windows 11 laptop with 32 GB of RAM, with Anaconda installed (Python 3.11.4). I do understand that I don't have a GPU, which is what it's complaining about.
I tried debugging with the VS Code debugger and found that it was failing at line 242:
self.model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto", revision="main")
I am able to reproduce the same issue in isolation, so it's the Hugging Face Transformers AutoModelForCausalLM class that won't let me proceed.
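For reference, here is roughly the isolated snippet I used to reproduce it (the checkpoint name below is just my guess at Roy's default; any GPTQ-quantized model seems to trigger the same check on a CPU-only machine):

```python
import torch
from transformers import AutoModelForCausalLM

# Placeholder GPTQ checkpoint -- assumed, swap in Roy's actual default.
# Any GPTQ-quantized model appears to hit the same GPU check without CUDA.
model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/WizardCoder-Python-7B-V1.0-GPTQ",
    torch_dtype=torch.float16,
    device_map="auto",
    revision="main",
)
# On my CPU-only laptop this raises:
#   "GPU is required to quantize or run quantize model."
```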
I will retry it on a machine that has a GPU.
Thanks!
Thank you so much for taking the time to provide the details. I'm really glad to hear that you appreciate the simplicity of Roy!
Regarding the GPU requirement issue, you're absolutely correct. Roy's default language model, a GPTQ-quantized WizardCoder-Python-7B, does indeed require a GPU.
If you'd like to use Roy with a different language model that doesn't have such GPU requirements, you can easily make the switch by using the 'config' argument when instantiating Roy: roy = Roy(config={'generate': <LM_of_your_choice>}).
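For example, here's a minimal sketch, assuming 'generate' accepts any callable that takes a prompt string and returns generated text (the small CPU model and the wrapper function below are illustrations, not Roy's confirmed API, so please check the docs for the exact expected signature):

```python
from transformers import pipeline

from roy import Roy  # assumed import path, for illustration

# A small CPU-friendly model standing in for the GPU-only GPTQ default.
cpu_lm = pipeline("text-generation", model="gpt2")

def my_generate(prompt):
    # Return only the generated continuation, not the echoed prompt.
    out = cpu_lm(prompt, max_new_tokens=128, return_full_text=False)
    return out[0]["generated_text"]

roy = Roy(config={'generate': my_generate})
```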
Let me know how this works out for you, and once again, thank you for your kind comments!
Thanks!