
run app.py error

Open XvHaidong opened this issue 2 years ago • 8 comments

Hello, when I run demo/app.py with the 7B model, I get the error `"addmm_impl_cpu_" not implemented for 'Half'`. Could you please tell me how to fix it?

```
This share link expires in 72 hours. For free permanent hosting and GPU upgrades (NEW!), check out Spaces: https://huggingface.co/spaces
Traceback (most recent call last):
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/gradio/routes.py", line 393, in run_predict
    output = await app.get_blocks().process_api(
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/gradio/blocks.py", line 1069, in process_api
    result = await self.call_function(
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/gradio/blocks.py", line 892, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/gradio/utils.py", line 549, in async_iteration
    return next(iterator)
  File "app.py", line 43, in predict
    for x in greedy_search(input_ids, model, tokenizer, stop_words=["[|Human|]", "[|AI|]"], max_length=max_length_tokens, temperature=temperature, top_p=top_p):
  File "/media/hlt/disk/chenyang_space/chenyang_space/xhd_space/baize-main/demo/app_modules/utils.py", line 253, in greedy_search
    outputs = model(input_ids)
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/peft/peft_model.py", line 575, in forward
    return self.base_model(
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 687, in forward
    outputs = self.model(
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 577, in forward
    layer_outputs = decoder_layer(
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 292, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 196, in forward
    query_states = self.q_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/chenyang/anaconda3/envs/xhd/lib/python3.8/site-packages/peft/tuners/lora.py", line 406, in forward
    result = F.linear(x, transpose(self.weight, self.fan_in_fan_out), bias=self.bias)
RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
```

XvHaidong avatar Apr 05 '23 02:04 XvHaidong
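For context: this error means the model's float16 ("Half") weights are being executed on CPU, where PyTorch has no half-precision `addmm` (matrix-multiply) kernel, so the forward pass fails at the first linear layer. The usual workaround is to keep float16 only on an accelerator and fall back to float32 on CPU. A minimal sketch of that decision (`choose_dtype` is a hypothetical helper for illustration, not part of the Baize code):

```python
def choose_dtype(device: str) -> str:
    """Pick a tensor dtype the target device can actually execute.

    PyTorch's CPU backend has no float16 matmul kernel, which is what
    raises '"addmm_impl_cpu_" not implemented for Half'. Half precision
    is only safe on an accelerator (GPU; MPS support is an assumption
    that depends on the PyTorch version).
    """
    if device.startswith(("cuda", "mps")):
        return "float16"  # half precision works on accelerators
    return "float32"      # CPU needs full precision
```

In practice this maps to loading the model with `torch_dtype=torch.float32` (or calling `model.float()`) when no GPU is available, at the cost of roughly double the memory.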

Got this error too on a MacBook M1. Please help, thanks~

hecor avatar Apr 05 '23 03:04 hecor

Fix done, please check again.

guoday avatar Apr 05 '23 04:04 guoday

Great,thanks

hecor avatar Apr 07 '23 15:04 hecor

But generating a reply is very slow on my MacBook M1, nearly one word per minute. Are there any parameters that can change this?

hecor avatar Apr 07 '23 15:04 hecor

You need to use a GPU. It's very slow if you use the CPU.

guoday avatar Apr 10 '23 01:04 guoday
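A 7B model running in float32 on a laptop CPU really can be this slow, so the first thing to verify is which device the model actually landed on. A small sketch that prefers CUDA, then Apple's MPS backend on M1, then CPU (hypothetical helper; the MPS check assumes PyTorch ≥ 1.12 on Apple Silicon):

```python
def pick_device() -> str:
    """Return the best available torch device string.

    Falls back gracefully when torch is not installed. The MPS branch
    is an assumption: it requires a PyTorch build with Apple Silicon
    support (>= 1.12).
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```

If this returns `"cpu"`, one-token-per-minute generation for a 7B model is expected rather than a bug.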

got it, thanks~

hecor avatar Apr 13 '23 01:04 hecor

Hi, I run demo/app.py on a remote server with the 7B model, and the terminal shows: `Reloading javascript... Running on local URL: http://127.0.0.1:7860`

but the URL doesn't work in Chrome on my local machine.

zay95 avatar Apr 18 '23 02:04 zay95

Set share=True in app.py and use the public URL.

guoday avatar Apr 18 '23 07:04 guoday
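Two common ways to reach a Gradio app running on a remote server: `share=True` (Gradio tunnels a temporary public `*.gradio.live` URL) or binding the server to `0.0.0.0` and browsing to the server's IP directly. A small sketch of the launch arguments involved (`launch_kwargs` is a hypothetical helper; `share`, `server_name`, and `server_port` are real `demo.launch()` parameters, and 7860 is Gradio's default port):

```python
def launch_kwargs(public_link: bool = True) -> dict:
    """Keyword arguments for gradio's demo.launch() on a remote server.

    share=True asks Gradio to tunnel a temporary public URL;
    server_name="0.0.0.0" additionally exposes the app on the server's
    own IP (reachable as http://<server-ip>:7860 if the port is open).
    """
    kwargs = {"server_name": "0.0.0.0", "server_port": 7860}
    if public_link:
        kwargs["share"] = True
    return kwargs
```

Alternatively, SSH port forwarding (`ssh -L 7860:127.0.0.1:7860 user@server`) lets the local browser use http://127.0.0.1:7860 without exposing anything publicly.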