Starrick Liu

10 comments by Starrick Liu

> There will be in just a few days :-) Alright, I'm looking forward to it.

> There will be in just a few days :-) Hello! I've found that Cutlass 3.5.0 has been released. Where can I access examples related to GEMV in a CUDA...

> Hello, I see it. I'm trying to modify it to remove this constraint, but the result seems to have precision problems. Does anyone know the specific reason for...

> > > > Hello, I see it. I'm trying to modify it to remove this constraint, but the result seems to have precision problems. Does anyone know the...

> Hi @StarWorkXc, I also tested with prompt lengths >7K, using similar modifications in my [repo](https://github.com/zhen-jia/FasterTransformer). The results are reasonable. I am thinking of submitting a PR for that. What...

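> > I think that works, go ahead and submit it. I'm still trying to get it to support 16k, but it's rather troublesome.
> >
> > @StarWorkXc Is there a good way to extend it to 16k or higher?

The limit is mainly in: src/fastertransformer/kernels/decoder_masked_multihead_attention/decoder_masked_multihead_attention_template.hpp

If the GPU architecture is recent enough, modify the kernel parameters so that the kernel can use the maximum shared memory the current architecture supports.

If the GPU is older and shared memory tops out at 48KB, try one of these two changes:

1. Following Flash Attention, rework the core logic of decoder_masked_multihead_attention_template.
2. Change the data type of the attention scores stored in shared memory from fp32 to fp16.

The second option is the simpler one. My modified version currently supports up to 32K of context on a GPU with 96KB of shared memory.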
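The effect of the fp32-to-fp16 change can be estimated with some quick arithmetic. The sketch below is a deliberately simplified model, not the kernel's actual shared-memory layout: it assumes one attention score per context token and lets you subtract a hypothetical `reserved_bytes` for whatever else the kernel keeps in shared memory. The function name and parameters are illustrative only.

```python
def max_context_tokens(shared_mem_bytes: int,
                       bytes_per_score: int,
                       reserved_bytes: int = 0) -> int:
    """Upper bound on context length when one score per token sits in shared memory.

    Simplified model (assumption): the score buffer is the only per-token
    consumer of shared memory; reserved_bytes stands in for everything else.
    """
    usable = shared_mem_bytes - reserved_bytes
    if usable < 0:
        raise ValueError("reserved_bytes exceeds the shared memory budget")
    return usable // bytes_per_score


KIB = 1024

if __name__ == "__main__":
    for smem_kib in (48, 96):
        for dtype_name, width in (("fp32", 4), ("fp16", 2)):
            tokens = max_context_tokens(smem_kib * KIB, width)
            print(f"{smem_kib}KB shared memory, {dtype_name} scores: "
                  f"at most {tokens} tokens")
```

Under this model, halving the score width doubles the bound, and a 96KB budget with fp16 scores leaves headroom above 32K tokens for other buffers. The real limit depends on what else the kernel allocates.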

+1. Has the original poster solved this?

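Solved. Check the README: you need to install with `pip install -e ./`.

When installing a Python package with pip, the -e option installs it in editable mode. This is typically used in a development environment, when you want to modify the package's source code.

With -e, pip does not copy the package files into the interpreter's site-packages directory. Instead, it records a reference there (typically a `.pth` file or a similar link) that points at the package directory on your local file system. This means changes you make to the Python source take effect in your Python installation immediately.

For example, the following command installs a package named mypackage in editable mode: `pip install -e /path/to/mypackage`

Python will then import mypackage directly from /path/to/mypackage, so you can edit the code there and have the changes take effect right away.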
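One way to confirm that an editable install took effect is to check where Python actually imports the package from. The sketch below uses only the standard library; the helper names are made up for illustration, and `mypackage` is the placeholder name from the comment above, not a real package.

```python
import importlib.util
import sysconfig
from pathlib import Path
from typing import Optional


def module_location(name: str) -> Optional[Path]:
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin in (None, "built-in", "frozen"):
        return None
    return Path(spec.origin).resolve()


def lives_in_site_packages(name: str) -> bool:
    """True if the module's file is inside this interpreter's site-packages."""
    location = module_location(name)
    if location is None:
        return False
    paths = sysconfig.get_paths()
    for key in ("purelib", "platlib"):
        site_dir = Path(paths[key]).resolve()
        try:
            location.relative_to(site_dir)
            return True
        except ValueError:
            continue  # not under this directory, try the next one
    return False


if __name__ == "__main__":
    # "mypackage" is a placeholder; substitute the package you installed.
    print(module_location("mypackage"))
    print(lives_in_site_packages("mypackage"))
```

After `pip install -e /path/to/mypackage`, one would expect `module_location` to point into /path/to/mypackage and `lives_in_site_packages` to return False. A regular `pip install` would normally give the opposite result.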