MOSS
Error: Unexpected MMA layout version found
```
python: /project/lib/Analysis/Utility.cpp:136: bool mlir::supportMMA(mlir::Value, int): Assertion `(version == 1 || version == 2) && "Unexpected MMA layout version found"' failed.
```
Same issue here on a 1080 Ti.
Hit this on a Titan X as well. Could this be related to using int8/int4 with triton, given that triton most likely does not support 8/4-bit on Pascal or older GPUs? https://github.com/qwopqwop200/GPTQ-for-LLaMa/issues/142 https://github.com/openai/triton/pull/1505#issuecomment-1517484120
Same problem on a P40. It comes from triton: the matmul_248_kernel function fails when it reaches `c = accumulator.to(tl.float16)`. The compute architecture may just be too old; everyone hitting this seems to be on NVIDIA cards with compute capability below 7.0. Is there any workaround?
Same question. I got the quantized version running, but this error is thrown as soon as I submit a prompt.
I'm on a rented GPU cloud server.
According to yesterday's update (https://github.com/openai/triton/pull/1505/files), python/triton/language/semantic.py now states that GPUs with compute capability below 70 do not support Float8 and Float16:

```python
if torch.version.hip is None:
    device = triton.runtime.jit.get_current_device()
    capability = triton.runtime.jit.get_device_capability(device)
    capability = capability[0] * 10 + capability[1]
    if capability < 70:
        assert (
            not rhs.dtype.is_fp16() and not rhs.dtype.is_fp8()
        ), "Float8 and Float16 types are not supported for compute capability < 70 (use Float32 or above)"
```

The P100 and P40 both have compute capability 6.x, so for now they can only use Float32, but then VRAM is insufficient. This urgently needs a fix. The NVIDIA V100, NVIDIA TITAN V, and newer cards are supported.
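To see where a given card falls relative to that threshold, here is a small sketch of the same capability arithmetic as a pure-Python helper (the names `encode_capability` and `supports_fp16_matmul` are made up for illustration; the commented torch query at the end is the standard way to read the tuple on a real machine):

```python
def encode_capability(major: int, minor: int) -> int:
    """Combine a CUDA compute-capability tuple into triton's integer form, e.g. (6, 1) -> 61."""
    return major * 10 + minor

def supports_fp16_matmul(major: int, minor: int) -> bool:
    """Mirror the triton check quoted above: fp16/fp8 matmul needs capability >= 70."""
    return encode_capability(major, minor) >= 70

# Pascal cards (P100 = 6.0, P40 = 6.1, 1080 Ti = 6.1) fall below the threshold;
# Volta (V100 = 7.0) and newer pass it.
print(supports_fp16_matmul(6, 0))  # P100 -> False
print(supports_fp16_matmul(7, 0))  # V100 -> True

# On a machine with PyTorch and a CUDA GPU you can query the tuple directly:
# import torch
# major, minor = torch.cuda.get_device_capability(0)
```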
Hi, for the P100, is this the right way to change it to Float32? It still errors out, and I have 16 GB of VRAM.
Hit the same problem on a P100. Does MOSS not support the P100?
Tested today: after switching to float32, the P100/P40 either runs out of VRAM or still hits "Unexpected MMA layout version found".
The triton site says support for fp16 quantized models is incomplete; older cards like the P100/P40 all report the error above. We will have to wait for them to add support for more older GPUs.
Also, I verified that a 32 GB V100 can run the int4 quantized model.
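The VRAM observations above line up with a back-of-the-envelope estimate of weight storage alone (the ~16B parameter count for MOSS is an assumption here; adjust for your checkpoint, and note activations and KV cache add further overhead):

```python
def weight_gib(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in GiB (no activations/KV cache)."""
    return n_params * bits_per_param / 8 / 1024**3

N = 16e9  # assumed parameter count
for name, bits in [("fp32", 32), ("fp16", 16), ("int8", 8), ("int4", 4)]:
    print(f"{name}: ~{weight_gib(N, bits):.1f} GiB")
```

Under that assumption, fp32 weights alone are ~60 GiB and fp16 ~30 GiB, so neither fits a 16 GB P100 or a 24 GB P40, while int4 at ~7.5 GiB does fit, which is consistent with the single-P40 int4 result in this thread.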
Found a solution: per the latest PR https://github.com/OpenLMLab/MOSS/pull/175, switch from triton to auto-gptq, which bypasses the triton check entirely.
Successfully ran the int4 quantized model on a single P40 (24 GB).
The steps are as follows (activating the conda env before the setup script is implied in the original):

```shell
git clone https://github.com/PanQiWei/AutoGPTQ
conda create -n moss python==3.10
conda activate moss
cd MOSS
python setup_env.py --install_auto_gptq
```
Then edit MOSS\moss_cli_demo.py at line 31, replacing

```python
model = load_checkpoint_and_dispatch(
    raw_model, model_path, device_map="auto",
    no_split_module_classes=["MossBlock"], dtype=torch.float16
)
```

with

```python
model = MossForCausalLM.from_pretrained(model_path, trust_remote_code=True).half().cuda()
```
Finally, run `python moss_cli_demo.py`.