Lambert Sun

Results: 3 comments by Lambert Sun

According to yesterday's update, [https://github.com/openai/triton/pull/1505/files](https://github.com/openai/triton/pull/1505/files), python/triton/language/semantic.py now states that GPUs with a compute capability below 70 (that is, below 7.0) do not support Float8 and Float16:
'''
if torch.version.hip is None:
    device = triton.runtime.jit.get_current_device()
    capability = triton.runtime.jit.get_device_capability(device)
    capability = capability[0] * 10 + capability[1]
    if capability < 70:
        assert ( not...
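The arithmetic in the quoted check can be illustrated with a small, standalone sketch. It does not call Triton; the names `MIN_CAPABILITY`, `encode_capability`, and `passes_capability_check` are hypothetical and only mirror the `major * 10 + minor` encoding and the `< 70` threshold from the excerpt above. The device capabilities used (Tesla P40 = 6.1, Tesla V100 = 7.0) are NVIDIA's published values.

```python
# Illustrative only: reproduces the arithmetic of the quoted check,
# not Triton's actual implementation or API.

MIN_CAPABILITY = 70  # threshold from the quoted snippet (compute capability 7.0)


def encode_capability(major: int, minor: int) -> int:
    """Fold a (major, minor) compute capability into one integer, e.g. (6, 1) -> 61."""
    return major * 10 + minor


def passes_capability_check(major: int, minor: int) -> bool:
    """Return True when the encoded capability meets the quoted threshold."""
    return encode_capability(major, minor) >= MIN_CAPABILITY


if __name__ == "__main__":
    # Published compute capabilities for two example GPUs.
    devices = {"Tesla P40": (6, 1), "Tesla V100": (7, 0)}
    for name, (major, minor) in devices.items():
        encoded = encode_capability(major, minor)
        verdict = "passes" if passes_capability_check(major, minor) else "fails"
        print(f"{name}: capability {encoded} {verdict} the check")
```

On a real machine, the `(major, minor)` pair could be taken from `torch.cuda.get_device_capability()` instead of being hard-coded. A P40 encodes to 61, which is why it trips the assertion.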

> > > According to yesterday's update, https://github.com/openai/triton/pull/1505/files, python/triton/language/semantic.py now states that GPUs with a compute capability below 70 (that is, below 7.0) do not support Float8 and Float16:
> > > '''
> > > if torch.version.hip is None:
> > >     device = triton.runtime.jit.get_current_device()
> > >     capability = triton.runtime.jit.get_device_capability(device)
> > >     capability = capability[0] * 10 + capability[1]
> > >     if...

Found a workaround: following the latest pull request from an expert contributor, [https://github.com/OpenLMLab/MOSS/pull/175](https://github.com/OpenLMLab/MOSS/pull/175), replace triton with auto-gptq, which bypasses the triton check. **Tested successfully with the int4 quantized version on a single P40 (24G).** The steps are:
git clone https://github.com/PanQiWei/AutoGPTQ
conda create -n moss python==3.10
cd MOSS
python setup_env.py --install_auto_gptq
Then edit MOSS\moss_cli_demo.py at line 31 (L31), changing
'''
model = load_checkpoint_and_dispatch(
    raw_model, model_path, device_map="auto", no_split_module_classes=["MossBlock"], dtype=torch.float16...