fastllm issues

Fix: compile error on arm64 Windows

Fixes the compile error on arm64 Windows: ![image](https://github.com/ztxz16/fastllm/assets/25482706/f0250f92-1c3d-45d1-a489-9a0dba5321d6)

[CMakeFiles/Makefile2:100: CMakeFiles/pyfastllm.dir/all]

GPU: 4090; OS: Ubuntu 22.04; driver and CUDA: Driver Version 550.54.15, CUDA Version 12.4; Python 3.10. With chatglm2, chatglm3, and qwen1.5 alike, the model only outputs: ![image](https://github.com/ztxz16/fastllm/assets/8828385/bb7e8be8-70d4-4559-b580-1dbf2a6532d0)

VincentLore

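Add add_special_tokens option (default true), with support for chatglm models

As the title says: the option defaults to true, which leaves the current chatglm inference logic unchanged. When set to false, chatglm's special tokens are removed. Please help review, thanks!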

levinxo

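chatglm3: identical prompts produce identical results

With a model converted from chatglm3-6b, if the prompt changes little, repeated generations return the same result. If I want each generation to be random, is there a way to configure that? I am running the official examples fastapiexamples/web_api.py and examples/web_api_client.py, and setting parameters such as temperature, tok, and top has no effect. ![{4E8424AE-9B03-4ccc-82E0-B45B71FFCA67}](https://github.com/ztxz16/fastllm/assets/151132985/e59dd498-789d-4e7b-a5d9-786a6d130a2b)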

ttaop
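
For the chatglm3 question above, it may help to see how temperature, top-k, and top-p usually interact in a sampler. The following is a generic, standard-library sketch and not fastllm's actual implementation; `sample_token` and its parameter names are hypothetical. It shows one common reason sampling parameters appear to do nothing: if the effective top-k is 1 (or the temperature is 0), decoding is greedy and the output is deterministic regardless of the other settings. Whether that applies to the web_api.py example is an assumption worth checking, not something the issue confirms.

```python
import math
import random


def sample_token(logits, temperature=1.0, top_k=0, top_p=1.0, rng=random):
    """Pick a token index from raw logits (hypothetical helper, not fastllm API)."""
    if temperature <= 0 or top_k == 1:
        # Greedy decoding: always the highest-scoring token, so no randomness.
        return max(range(len(logits)), key=lambda i: logits[i])

    # Temperature-scaled softmax; subtracting the max keeps exp() stable.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    ranked = sorted(((w / total, i) for i, w in enumerate(weights)), reverse=True)

    # Top-k: keep only the k most probable tokens (0 means no limit).
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-p (nucleus): keep the smallest prefix whose probability mass reaches top_p.
    kept, mass = [], 0.0
    for prob, index in ranked:
        kept.append((prob, index))
        mass += prob
        if mass >= top_p:
            break

    indices = [index for _, index in kept]
    probs = [prob for prob, _ in kept]
    # random.choices renormalizes the weights of the surviving candidates.
    return rng.choices(indices, weights=probs, k=1)[0]


logits = [2.0, 1.5, 0.3, -1.0]  # made-up scores for illustration
rng = random.Random(0)

# top_k=1 is greedy: every call returns the same index.
greedy = {sample_token(logits, top_k=1) for _ in range(50)}

# With top_k > 1 and temperature > 0, repeated calls can differ.
sampled = {
    sample_token(logits, temperature=1.0, top_k=3, top_p=0.95, rng=rng)
    for _ in range(200)
}
```

In this sketch `greedy` contains a single index while `sampled` contains several, so when debugging it is worth confirming that the values sent by the client actually reach the sampler and that top-k is greater than 1.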

Do you have a plan to implement the CudaCatOp?

dp-aixball

Qwen qwen1.5-14B-chat decoding error

2

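[Symptom] The qwen1.5-14B-Chat model raises UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 1: unexpected end of data during decoding. [Description] The model input is: "Suppose f(x)=x; what is the integral of f(x) from 1 to 2?" The output token IDs include 11995 and 18137, and these two token IDs trigger the exception above. They correspond to special characters in the vocabulary. The token IDs were decoded with: model = pyfastllm.create_llm(model_path); model.weight.tokenizer.decode([11995, 18137]). Decoding them with the original model's tokenizer works without raising the exception, although the result displays as characters that are not human-readable. I think (1) the decoding exception needs to be handled, and (2) the generated token IDs themselves look wrong: even when they decode normally, they seem unrelated to the question. [flm model conversion method] from transformers import AutoTokenizer, AutoModel tokenizer = AutoTokenizer.from_pretrained(xxx,...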

yiguanxian
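
The error in the qwen1.5 report above is what Python raises when a byte string ends in the middle of a multi-byte UTF-8 character: 0xe9 is the lead byte of a three-byte sequence. The sketch below reproduces that condition with plain bytes and shows two common ways to tolerate it. It uses only the standard library and does not call fastllm; `decode_token_stream` is a hypothetical helper, and whether the tokenizer in the issue actually emits partial byte sequences is an assumption.

```python
import codecs


def decode_token_stream(byte_chunks):
    """Decode UTF-8 chunks that may split a character across chunk boundaries."""
    # The incremental decoder buffers an incomplete trailing sequence
    # instead of raising, and emits the character once it is complete.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    pieces = [decoder.decode(chunk) for chunk in byte_chunks]
    # final=True flushes the buffer; leftover bytes become U+FFFD.
    pieces.append(decoder.decode(b"", final=True))
    return "".join(pieces)


encoded = "x除".encode("utf-8")  # the second character starts with byte 0xe9
first, second = encoded[:2], encoded[2:]  # split in the middle of that character

# Strict decoding of the truncated prefix fails, as in the issue.
try:
    first.decode("utf-8")
    failed_at = None
except UnicodeDecodeError as exc:
    failed_at = exc.start  # index of the offending byte

# Option 1: decode lossily; the partial character becomes U+FFFD.
lossy = first.decode("utf-8", errors="replace")

# Option 2: buffer across chunks so the character is decoded once complete.
joined = decode_token_stream([first, second])
```

Option 1 avoids the exception but hides the underlying bytes; option 2 suits streaming output, where one character may be spread over several tokens. Neither addresses the reporter's second point, that the generated token IDs themselves look unrelated to the prompt.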

cmake -j error

2

![image](https://github.com/ztxz16/fastllm/assets/106646583/b7f0c012-e8cf-4995-ab2c-0d7d90321675) ![image](https://github.com/ztxz16/fastllm/assets/106646583/57108978-6e8f-4205-b1a5-56c8125311ce)

gggdroa

Cannot install fastllm_pytools

1

I meet the requirements and followed the steps, but the run did not succeed and no tools file was generated. ![443ff3404605486a32c65bc94c2bcdf](https://github.com/ztxz16/fastllm/assets/136301756/9d32559e-e099-4228-bd4a-271087031798) ![9c9ee0cf58abc1e50092760a7b7d48f](https://github.com/ztxz16/fastllm/assets/136301756/6ee47117-4a35-497f-ad7d-da2551dac954) How should I download it?

bailingchun

Streaming output interruption issue

```python
import time

# Streaming output event generator
async def chat_stream_event_generator(request: Request, chatStream):
    for chunck in chatStream:
        try:
            # Stop producing output once the client has disconnected.
            if await request.is_disconnected():
                print("Connection interrupted")
                break
            start_time = time.time()
            print(f"{start_time}->{chunck[0]}")
            yield chunck[0]
            print(f"{start_time}->write complete")
        except (BrokenPipeError, ConnectionResetError) as e:...
```

lwinhong

Result returned is always <unk>