
NameError: name 'transpose_matmul_248_kernel' is not defined

Open ZealHua opened this issue 1 year ago • 3 comments

The code is missing a definition for transpose_matmul_248_kernel. The problem is in a library the code uses: the error comes from quantization.py, which appears to be part of the Hugging Face model cache. At quantization.py, line 265, it looks like the import of transpose_matmul_248_kernel was forgotten; I checked, and there is no such import at the top of the file. The error is triggered during DeepSpeed fine-tuning, when execution reaches model_engine.backward(loss).

"""
  File "fine_tune.py", line 117, in <module>
    model_engine.backward(loss)
  File "/home/zeal/pytorch-venv/lib/python3.8/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
    ret_val = func(*args, **kwargs)
  File "/home/zeal/pytorch-venv/lib/python3.8/site-packages/deepspeed/runtime/engine.py", line 1796, in backward
    self.optimizer.backward(loss, retain_graph=retain_graph)
  File "/home/zeal/pytorch-venv/lib/python3.8/site-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
    ret_val = func(*args, **kwargs)
  File "/home/zeal/pytorch-venv/lib/python3.8/site-packages/deepspeed/runtime/zero/stage3.py", line 1923, in backward
    self.loss_scaler.backward(loss.float(), retain_graph=retain_graph)
  File "/home/zeal/pytorch-venv/lib/python3.8/site-packages/deepspeed/runtime/fp16/loss_scaler.py", line 62, in backward
    scaled_loss.backward(retain_graph=retain_graph)
  File "/home/zeal/pytorch-venv/lib/python3.8/site-packages/torch/_tensor.py", line 487, in backward
    torch.autograd.backward(
  File "/home/zeal/pytorch-venv/lib/python3.8/site-packages/torch/autograd/__init__.py", line 200, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  File "/home/zeal/pytorch-venv/lib/python3.8/site-packages/torch/autograd/function.py", line 274, in apply
    return user_fn(self, *args)
  File "/home/zeal/pytorch-venv/lib/python3.8/site-packages/torch/cuda/amp/autocast_mode.py", line 123, in decorate_bwd
    return bwd(*args, **kwargs)
  File "/home/zeal/.cache/huggingface/modules/transformers_modules/fnlp/moss-moon-003-sft-plugin-int4/353c499f7415575ba217704f3f28a1e817eb7487/quantization.py", line 292, in backward
    grad_input = transpose_matmul248(grad_output, qweight, scales, qzeros, g_idx, bits, maxq)
  File "/home/zeal/.cache/huggingface/modules/transformers_modules/fnlp/moss-moon-003-sft-plugin-int4/353c499f7415575ba217704f3f28a1e817eb7487/quantization.py", line 265, in transpose_matmul248
    transpose_matmul_248_kernel[grid](input, qweight, output,
NameError: name 'transpose_matmul_248_kernel' is not defined
"""

ZealHua, May 01 '23 04:05

Have you solved this problem? I'm running into the same issue.

lby129, May 08 '23 10:05

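Folks, I suspect this is just a naming mistake: transpose_matmul_248_kernel should refer to trans_matmul_248_kernel, which is defined at line 168. Rename the function in the call and you can move on to the next bug 😊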

lipengyuer, May 10 '23 10:05

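@lipengyuer That doesn't work for me. I changed both the file in .cache and the corresponding quantization.py under models, but when I rerun the script, the copy in .cache is reverted to the original.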

631068264, May 29 '23 02:05
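
For reference, the rename suggested by lipengyuer above can be sketched as a small script. This is only a minimal sketch, assuming the fix really is a one-identifier rename; the helper names `rename_kernel_call` and `patch_file` are made up for illustration and are not part of MOSS, transformers, or DeepSpeed. It does not address the cached copy being reverted, as reported in the last comment.

```python
import re
from pathlib import Path

# Identifier used at the failing call site (from the traceback) and the
# name lipengyuer reports as actually defined around line 168.
WRONG = "transpose_matmul_248_kernel"
RIGHT = "trans_matmul_248_kernel"


def rename_kernel_call(source: str) -> str:
    """Replace whole-word occurrences of WRONG with RIGHT.

    Word boundaries keep similar names such as `transpose_matmul248`
    untouched, and running the function twice changes nothing further.
    """
    return re.sub(rf"\b{re.escape(WRONG)}\b", RIGHT, source)


def patch_file(path: Path) -> bool:
    """Hypothetical helper: rewrite `path` in place; True if it changed."""
    original = path.read_text(encoding="utf-8")
    patched = rename_kernel_call(original)
    if patched == original:
        return False
    path.write_text(patched, encoding="utf-8")
    return True
```

One possible way to use it is to call `patch_file(Path(...))` on each copy of quantization.py you intend to load, for example the one in your local model directory, and check the return value to confirm that the call site was actually found.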