fastllm issues

How do I run batch inference after loading a model via 'from fastllm_pytools import llm'? llm.py has no batch inference function

1

Accelerating chatglm2 seems to have no effect; fastllm and calling the model directly from Python both take about 30 ms/token

1

Loading and calling the model directly from Python code: ![image](https://github.com/ztxz16/fastllm/assets/35361034/e1ffbd23-2f2a-4f5d-8f82-e75c391feb36) Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision. Loading checkpoint shards:...

17714196157
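
A comparison like the 30 ms/token figure above is easier to reproduce when both backends are timed the same way. The following is a minimal sketch, not fastllm's own benchmark: `ms_per_token` and `fake_generate` are hypothetical names, and it assumes each backend can be wrapped in a callable that yields one item per generated token (a streaming chunk is not guaranteed to be exactly one token).

```python
import time
from typing import Callable, Iterable


def ms_per_token(generate: Callable[[str], Iterable[str]],
                 prompt: str,
                 warmup: int = 1,
                 runs: int = 3) -> float:
    """Return the mean wall-clock milliseconds per yielded token."""
    # Warm-up runs are excluded so one-time costs (model load, cache
    # allocation) do not skew the average.
    for _ in range(warmup):
        for _ in generate(prompt):
            pass

    total_tokens = 0
    total_seconds = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        count = sum(1 for _ in generate(prompt))
        total_seconds += time.perf_counter() - start
        total_tokens += count

    if total_tokens == 0:
        raise ValueError("generate() produced no tokens")
    return 1000.0 * total_seconds / total_tokens


def fake_generate(prompt: str) -> Iterable[str]:
    """Hypothetical stand-in for a real backend: one 'token' per word."""
    for word in prompt.split():
        time.sleep(0.001)  # simulate per-token latency
        yield word


if __name__ == "__main__":
    print(f"{ms_per_token(fake_generate, 'a b c d'):.2f} ms/token")
```

To compare backends, wrap each one (the fastllm model and the plain Python model) in its own generator with the same prompt and generation settings, then pass each wrapper to `ms_per_token`.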

chatglm2 fails when releasing memory after a run: CUDA error when release memory

2

After a query finishes running, the error "CUDA error when release memory" is reported. Please help. Error: CUDA error when release memory! CUDA error = 4, cudaErrorCudartUnloading at fastllm/src/devices/cuda/fastllm-cuda.cu:1493 'driver shutting down'

xiaoduozhou

Qwen-7b-Chat produces repeated output

9

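Same problem as many earlier issues. With Qwen-7B-Chat accelerated by fastllm, whether fp16 or int8, some prompts produce repeated output that never stops; without acceleration the output is normal. My environment: A6000, torch 2.0.1, CUDA 11.8, latest fastllm code. A prompt that never stops: 如何使用python的selenium将网页保存为pdf ("How to use Python's selenium to save a web page as a PDF") ![image](https://github.com/ztxz16/fastllm/assets/59313130/f7610440-c3dc-4c10-8759-3e63f7489b66)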

rufeng-h

libfastllm_tools.so is missing after building on Mac

5

Is the M2 processor not supported?

Lowpower

ModuleNotFoundError: No module named 'pyfastllm'

3

The commands in the README under pyfastllm (cd pyfastllm; python build_libs --cuda; python cli.py -p chatglm-6b-int8.bin -t 8) cannot be executed. Using the script in install.sh or running python setup.py directly also fails to install.

Cloopen-ReLiNK

No visible speedup; it is actually slower

11

Tried chatglm and baichuan; with fastllm, inference is actually slower

GUORUIWANG

Can a fine-tuned chatglm2 model be accelerated and deployed?

2

Can a fine-tuned chatglm2 model be accelerated and deployed?

sssssshf

Request for Support for Ascend Series Graphics Cards

1

Hello, I hope this message finds you well. I am writing to kindly request your support for the Ascend series of graphics cards in your project. As you may be...

junior-zsy

pyfastllm multithreading

2

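Using the cli_thread.py code, entering a question triggers an error: Segmentation fault (core dumped). Passing the prompt_input in response through makeInput first does not help either. Already tried ChatGLM2-6b, Baichuan, and Alpaca13B.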

sym19991125


