
After some tinkering, the backend reports an error: TypeError: '<' not supported between instances of 'tuple' and 'float'

Open ImGoodBai opened this issue 1 year ago • 8 comments

After some tinkering, the int4 model now loads fine, but when I submit a prompt from the browser, the backend reports the error below.

Environment: Ubuntu 22.04, NVIDIA-SMI 530.41.03

    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2019 NVIDIA Corporation
    Built on Sun_Jul_28_19:07:16_PDT_2019
    Cuda compilation tools, release 10.1, V10.1.243

    ~/MOSS$ python moss_gui_demo.py
    Waiting for all devices to be ready, it may take a few minutes...

    Running on local URL: http://0.0.0.0:6006
    Running on public URL: https://7b29d06f6fba682b.gradio.live
    This share link expires in 72 hours. For free permanent hosting and GPU upgrades (NEW!), check out Spaces: https://huggingface.co/spaces

    Traceback (most recent call last):
      File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/gradio/routes.py", line 401, in run_predict
        output = await app.get_blocks().process_api(
      File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/gradio/blocks.py", line 1302, in process_api
        result = await self.call_function(
      File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/gradio/blocks.py", line 1025, in call_function
        prediction = await anyio.to_thread.run_sync(
      File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
        return await get_asynclib().run_sync_in_worker_thread(
      File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
        return await future
      File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 867, in run
        result = context.run(func, *args)
      File "moss_gui_demo.py", line 122, in predict
        outputs = model.generate(
      File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
        return func(*args, **kwargs)
      File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/transformers/generation/utils.py", line 1571, in generate
        return self.sample(
      File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/transformers/generation/utils.py", line 2534, in sample
        outputs = self(
      File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/good/MOSS/models/modeling_moss.py", line 678, in forward
        transformer_outputs = self.transformer(
      File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/good/MOSS/models/modeling_moss.py", line 545, in forward
        outputs = block(
      File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/good/MOSS/models/modeling_moss.py", line 270, in forward
        attn_outputs = self.attn(
      File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/good/MOSS/models/modeling_moss.py", line 164, in forward
        qkv = self.qkv_proj(hidden_states)
      File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
        return forward_call(*input, **kwargs)
      File "/home/good/MOSS/models/quantization.py", line 371, in forward
        out = QuantLinearFunction.apply(x.reshape(-1, x.shape[-1]), self.qweight, self.scales,
      File "/home/good/anaconda3/envs/moss/lib/python3.8/site-packages/torch/cuda/amp/autocast_mode.py", line 94, in decorate_fwd
        return fwd(*args, **kwargs)
      File "/home/good/MOSS/models/quantization.py", line 283, in forward
        output = matmul248(input, qweight, scales, qzeros, g_idx, bits, maxq)
      File "/home/good/MOSS/models/quantization.py", line 254, in matmul248
        matmul_248_kernel[grid](input, qweight, output,
      File "/home/good/MOSS/models/custom_autotune.py", line 93, in run
        self.cache[key] = builtins.min(timings, key=timings.get)
    TypeError: '<' not supported between instances of 'tuple' and 'float'

ImGoodBai avatar Apr 26 '23 03:04 ImGoodBai
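The last frame of the traceback above is `builtins.min(timings, key=timings.get)` in `models/custom_autotune.py`. That call fails with this exact message when the values in `timings` are a mix of floats and tuples. The sketch below reproduces the error in isolation. The config names, the numbers, and the assumption that a normal benchmark result is a 3-tuple are all made up for illustration; only the `float('inf')` fallback appears later in this thread. Returning a tuple-shaped fallback is one possible way to keep the values comparable, not necessarily the fix applied in the repository.

```python
import builtins

# Hypothetical timings: names and numbers are illustrative, not taken from MOSS.
timings = {
    "config_a": float("inf"),        # float fallback for a config that could not run
    "config_b": (0.12, 0.10, 0.15),  # assumed tuple-shaped benchmark result
}

def pick_best(t):
    """Same selection expression as the failing line in the traceback."""
    return builtins.min(t, key=t.get)

try:
    pick_best(timings)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# One possible repair (an assumption): give the fallback the same shape as a
# normal result, so that min() only ever compares tuples with tuples.
INF_RESULT = (float("inf"), float("inf"), float("inf"))
fixed = {"config_a": INF_RESULT, "config_b": (0.12, 0.10, 0.15)}
best = pick_best(fixed)
print(best)
```

With the mixed dictionary the comparison is `tuple < float`, which Python refuses. With the tuple-shaped fallback the failed config simply sorts last.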

same issue.

laoshancun avatar Apr 26 '23 03:04 laoshancun

Same error. [image]

wuxianghou avatar Apr 26 '23 03:04 wuxianghou

Please see this issue: https://github.com/OpenLMLab/MOSS/issues/65

Hzfinfdu avatar Apr 26 '23 05:04 Hzfinfdu

Following issue https://github.com/OpenLMLab/MOSS/issues/65, I commented out the exception name in models/custom_autotune.py as shown here, but I still get the error below:

   except #triton.compiler.OutOfResources:
   return float('inf')

    $ python3 moss_cli_demo.py
    /usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.15) or chardet (3.0.4) doesn't match a supported version!
      warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
    Waiting for all devices to be ready, it may take a few minutes...
    triton not installed. Run pip install triton to load quantized version of MOSS.
    Traceback (most recent call last):
      File "moss_cli_demo.py", line 30, in <module>
        raw_model = MossForCausalLM._from_config(config, torch_dtype=torch.float16)
      File "/home/good/.local/lib/python3.8/site-packages/transformers/modeling_utils.py", line 1024, in _from_config
        model = cls(config, **kwargs)
      File "/home/good/MOSS/models/modeling_moss.py", line 612, in __init__
        self.quantize(config.wbits, config.groupsize)
      File "/home/good/MOSS/models/modeling_moss.py", line 736, in quantize
        from .quantization import quantize_with_gptq
      File "/home/good/MOSS/models/quantization.py", line 27, in <module>
        @autotune(
    NameError: name 'autotune' is not defined

ImGoodBai avatar Apr 26 '23 06:04 ImGoodBai
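A possible explanation for the `NameError`, offered as a guess: if the file literally contains `except #triton.compiler.OutOfResources:`, the `#` also comments out the trailing colon, so `models/custom_autotune.py` no longer parses. A guarded import elsewhere could then swallow that failure, print the "triton not installed" message, and leave `autotune` undefined. The guarded import is an assumption, since `quantization.py` is not shown in this thread. The sketch below only checks the parsing part. The surrounding `bench` function is hypothetical, and catching `Exception` is just one way to keep the clause valid.

```python
# Hypothetical reconstruction of the edited lines; bench() is made up.
edited = '''
def bench():
    try:
        return 1.0
    except #triton.compiler.OutOfResources:
        return float('inf')
'''

# One possible repair (an assumption): keep a valid except clause and
# move the original exception name into a trailing comment.
repaired = '''
def bench():
    try:
        return 1.0
    except Exception:  # was: triton.compiler.OutOfResources
        return float('inf')
'''

def compiles(source: str) -> bool:
    """Return True if the source parses, False if it raises SyntaxError."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False

print(compiles(edited))
print(compiles(repaired))
```

Running `python -c "import models.custom_autotune"` from the repository root would surface the underlying error directly, if there is one, instead of the generic "triton not installed" message.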

Updated.

Hzfinfdu avatar Apr 26 '23 06:04 Hzfinfdu

"Updated."

I still have the same problem.

66li avatar May 05 '23 08:05 66li

I have the same problem too.

ArlanCooper avatar May 05 '23 08:05 ArlanCooper

"I have the same problem too."

I got it working with the solution posted in this comment: https://github.com/OpenLMLab/MOSS/issues/129#issuecomment-1535899953

66li avatar May 05 '23 08:05 66li