chatglm.cpp issues

web_demo.py bug fixed

4

web_demo.py no longer displays HTML tags in the Gradio chatbot.

emikeliu
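
The report above does not say how the fix works. As a rough illustration only, one common way to control how markup in model output appears in a web chat widget is to escape it before display. The sketch below uses only Python's standard `html.escape`; the helper name `escape_for_chatbot` is hypothetical and is not taken from web_demo.py.

```python
import html


def escape_for_chatbot(text: str) -> str:
    """Escape HTML special characters so markup is shown as literal text.

    Hypothetical helper for illustration; not the actual web_demo.py code.
    """
    # html.escape converts &, < and > (and quotes by default) to entities.
    escaped = html.escape(text)
    # Keep line breaks, which an HTML renderer would otherwise collapse.
    return escaped.replace("\n", "<br>")


if __name__ == "__main__":
    print(escape_for_chatbot("<b>bold</b>\nnext line"))
```

Escaping first and inserting `<br>` afterwards keeps the line-break tag itself from being escaped.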

Android build

Does this project support building for Android with the NDK yet?

JianbangZ

Windows CMake build fails

4

E:\github_code\chatglm.cpp\chatglm.cpp(14,10): fatal error C1083: Cannot open include file: 'sys/mman.h': No such file or directory [E:\github_code\chatglm.cpp\build\chatglm.vcxproj]

matr1xes

Porting to Windows

1. `getopt` is supported by a submodule
2. mapped file ported to Windows (like `llama.cpp`)
3. UTF-8 input on Windows (like `llama.cpp`)

foldl

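Request: add multi-GPU inference

Thank you very much for your work. The current version does not seem to support single-machine multi-GPU inference. On a machine with 8 x RTX 2080 Ti 11G, when loading the unquantized chatglm2-6b-f16 weights, ggml detects all 8 GPUs, but all weights are placed on the first GPU, which causes a CUDA out-of-memory error. Running the same main with the q_4 weights works without any problem. System: Ubuntu 20.04. Running `echo $CUDA_VISIBLE_DEVICES` in bash returns `0,1,2,3,4,5,6,7`. I hope this can be changed to support multi-GPU inference. Thanks!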

grizxlyzx
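
A back-of-the-envelope estimate may help explain why the f16 weights run out of memory on a single 11G card while the 4-bit weights fit. The sketch below counts the weights only (no KV cache or scratch buffers). The parameter count of about 6.2 billion is an assumption, and it further assumes the "q_4" file uses ggml's `q4_0` layout, which stores 32 weights plus one fp16 scale per block.

```python
# Approximate bits stored per weight for each ggml format.
# q4_0 and q8_0 pack 32 weights per block together with one fp16 scale.
BITS_PER_WEIGHT = {
    "f16": 16.0,
    "q8_0": (32 * 8 + 16) / 32,  # 8.5 bits
    "q4_0": (32 * 4 + 16) / 32,  # 4.5 bits
}

N_PARAMS = 6.2e9  # assumption: rough parameter count for chatglm2-6b
GPU_MEMORY_GIB = 11.0  # one RTX 2080 Ti, as in the report


def weight_size_gib(n_params: float, dtype: str) -> float:
    """Return the approximate size of the weights alone, in GiB."""
    return n_params * BITS_PER_WEIGHT[dtype] / 8 / 2**30


if __name__ == "__main__":
    for dtype in BITS_PER_WEIGHT:
        size = weight_size_gib(N_PARAMS, dtype)
        verdict = "fits" if size < GPU_MEMORY_GIB else "does not fit"
        print(f"{dtype}: {size:.1f} GiB, {verdict} on one {GPU_MEMORY_GIB:.0f} GiB card")
```

Under these assumptions the f16 weights alone exceed the memory of one card, which is consistent with the reported failure when every tensor is placed on the first GPU.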

Tried 8-bit and 4-bit quantization and noticed no obvious speed difference

1

Tested on a 13900KF. I can't say much about accuracy, since I have not tested it.

liaoweiguo

Python module call fails

5

# MacBook M1 Pro 16G

## Build succeeded

```shell
cmake -B build  # succeeded
cmake --build build -j  # succeeded
python convert.py -i ~/models/THUDM/chatglm2-6b -t f16 -o chatglm2-ggml-f16.bin  # succeeded
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████|...
```

suifei

Could you try VisualGLM?

1

I think this would be even more useful for something like VisualGLM, which consumes a lot of RAM and VRAM.

billzhao9

Adding profiling info for chatglm.cpp

1

Hi, first of all, thanks for creating such an interesting project for ChatGLM. I wonder if you plan to add performance profiling information similar to [llama_print_timings](https://github.com/ggerganov/llama.cpp/blob/b8c8dda75fdf5fdea49c80af36818e7c30fe0ddf/llama.cpp#L3432C6-L3432C25) in llama.cpp?

sammysun0711
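
To make the request above concrete, here is a minimal sketch of the kind of per-phase timing summary being asked for. It is not chatglm.cpp or llama.cpp code: the `Timings` class, the phase names, and the `time.sleep` calls standing in for real work are all hypothetical.

```python
import time
from contextlib import contextmanager


class Timings:
    """Accumulate wall-clock time and token counts per named phase."""

    def __init__(self):
        self.seconds = {}
        self.tokens = {}

    @contextmanager
    def measure(self, phase: str, n_tokens: int):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if the measured code raises.
            elapsed = time.perf_counter() - start
            self.seconds[phase] = self.seconds.get(phase, 0.0) + elapsed
            self.tokens[phase] = self.tokens.get(phase, 0) + n_tokens

    def report(self) -> str:
        lines = []
        for phase, secs in self.seconds.items():
            n = self.tokens[phase]
            ms_per_token = secs * 1000 / n if n else float("nan")
            lines.append(
                f"{phase}: {secs * 1000:.2f} ms / {n} tokens "
                f"({ms_per_token:.2f} ms per token)"
            )
        return "\n".join(lines)


if __name__ == "__main__":
    timings = Timings()
    with timings.measure("prompt eval", n_tokens=8):
        time.sleep(0.01)  # stand-in for processing the prompt
    with timings.measure("eval", n_tokens=4):
        time.sleep(0.02)  # stand-in for generating tokens
    print(timings.report())
```

Separating prompt processing from token generation matters because the two phases usually have very different per-token costs.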

Could this project support Ascend hardware?

As the title says.

BrightXiaoHan


