chatglm.cpp issues

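Could automated builds via GitHub Actions be added, like llama.cpp has?

That way users could download and run it directly, without having to set up an environment first.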

CRGBS

Would you like to provide the chatglm int4 ggml model?

2

Would you like to provide the chatglm int4 ggml model?

hongkyunn

Allow configuring SCRATCH_SIZE (not enough space in the scratch memory pool)

After chatting for a few lines, I receive the following error: > ChatGLM2 > ggml_new_tensor_impl: not enough space in the scratch memory pool (needed 173910528, available 150994944) /home/d/bin/chat-gml2: line 4:...

kardianos

130B

1

Can this work with the GLM-130B model? https://github.com/THUDM/GLM-130B

iHaagcom

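GPU inference is twice as slow as llama.cpp

My machine has an AMD Ryzen 5950x, an NVidia RTX A6000, and CUDA 11.7. I currently have two test configurations, both on the latest code as of July 6 and both using the same parameters: t=6, l/n=128, prompt="how to build a house in 10 steps". C1: chatglm2-6B, using chatglm.cpp. C2: vicuna_7b_v1.3, using llama.cpp. **On CPU:** **FP16** -...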

JianbangZ

Could setting a seed be supported?

As the title says.

JianbangZ

How to emulate stream chat and stream generate

1

Currently history can only be used in interactive mode. How can stream_chat be emulated? https://huggingface.co/THUDM/chatglm-6b/blob/a70fe6b0a3cf1675b3aec07e3b7bb7a8ce73c6ae/modeling_chatglm.py#L1319 https://huggingface.co/THUDM/chatglm-6b/blob/a70fe6b0a3cf1675b3aec07e3b7bb7a8ce73c6ae/modeling_chatglm.py#L1293

YerongLi

Could a Docker version be provided?

As shown in the image.

stanle1

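Evaluating generation speed

1

Thanks for the outstanding work! I have a question about the generation speed in the table below: how is this speed (ms/token) calculated? ![image](https://github.com/li-plus/chatglm.cpp/assets/26675984/52b5edcf-cfc5-455a-88f4-3300630e407c)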

Vincent131499

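The GPU is not used when running through the Python interface

1

Hello, I ran the following command: `python cli_chat.py -m ../../pretrained-models/chatglm2-6b-ggml-q8_0.bin -i` and found that the GPU was not used. I had previously built with `cmake -B build -DGGML_CUBLAS=ON` followed by `cmake --build build -j`. The following command does use the GPU as expected: `./build/bin/main -m ../pretrained-models/chatglm2-6b-ggml-q8_0.bin -i`. My question: does the Python interface still need to be adapted for GPU use? If so, what needs to change?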

Vincent131499
