Lun Zhongwang

404 Peking

Results 10 comments of Lun Zhongwang

chatglm2: running llm.from_hf fails with Error: cublas error

@ztxz16 Hi, has this been resolved? I have 16 GB of GPU memory, int8, input length 2048, output length 100, and it OOMs as soon as batch_size is set to 3.

Is the Python batch inference interface still not implemented?

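> Yes, this will be updated later together with generationConfig. For now you can call stream_response or stream_chat from multiple threads, and requests are batched automatically internally (currently only fp16 benefits from batching)

@ztxz16 Hi, has this been updated now? Is batch inference available in pyfastllm? Thanks~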
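The multi-threaded calling pattern described in the quoted reply can be sketched roughly as follows. This is only an illustration, not fastllm's actual API: `StandInModel` is a hypothetical stand-in so the snippet runs on its own, and only the method name `stream_response` is taken from the quote. Whether and how requests get batched internally depends on the real library.

```python
from concurrent.futures import ThreadPoolExecutor


class StandInModel:
    """Hypothetical stand-in for a loaded model (assumption, not fastllm code).

    Only the method name stream_response comes from the quoted comment.
    """

    def stream_response(self, prompt):
        # Yield the reply piece by piece, as a streaming API would.
        for piece in ("echo", ": ", prompt):
            yield piece


def collect(model, prompt):
    # Drain one stream into a single string.
    return "".join(model.stream_response(prompt))


def run_concurrently(model, prompts, max_workers=4):
    # Up to max_workers requests are in flight at once;
    # results come back in the same order as the prompts.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda p: collect(model, p), prompts))


if __name__ == "__main__":
    print(run_concurrently(StandInModel(), ["hello", "world"]))
```

With a real model object in place of the stand-in, each thread would issue its own request and any batching would happen inside the library, as the quote describes.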

What is the meaning of the following code in 'MPViT-main\semantic_segmentation\configs\_base_\models'

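It is the part that computes the factorized self-attention in the Transformer. Also, has anyone running in a multi-GPU environment seen torch.distributed.elastic.multiprocessing.api:failed (exitcode: -9) reported every two validation runs, after which the program terminates?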

About the parameters of drop_path_rate

It looks like the code doesn't actually use drop_path; as far as I remember, the probability is set to 0.

[Feature] Support for SelfExtend-style context expansion

As the paper mentions, Self-Extend does not support flash-attn.

the gpu memory usage of finetuning dualstylegan on 8 gpus

Thanks for your reply, I'll try it later.

Code for Det/Seg

Expecting your code for segmentation based on the ADE20K dataset, etc.

R.I.P.

R.I.P

Error: cublas error.terminate called after throwing an instance of 'char const*'

@lxp521125 That's an OOM, isn't it?

How do I run batch inference after loading the model via 'from fastllm_pytools import llm'? There is no batch inference function in llm.py

@q497629642 Does this require pyfastllm? Did you get yours resolved?