chatglm.cpp
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
There is no way to make parallel calls, so what is the point of serving an OpenAI API?
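For reference, a minimal client sketch against that server, assuming it was started with `uvicorn chatglm_cpp.openai_api:app` on `127.0.0.1:8000`; the base URL, port, and `model` value here are assumptions, not confirmed defaults:

```python
# Hypothetical client sketch: queries a locally running chatglm_cpp.openai_api
# server through the standard openai client (openai>=1.0). Whether concurrent
# requests are served in parallel or serialized is exactly the open question.
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="unused")

response = client.chat.completions.create(
    model="default",  # placeholder; the local server may ignore this field
    messages=[{"role": "user", "content": "你好"}],
)
print(response.choices[0].message.content)
```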
Does `chatglm_cpp.openai_api:app` support serving a `.bin` model converted from Qwen?
After installing all the libs and downloading the model, I tried to quantize it to q4_0, but tool calling does not work. Another drawback is that the model seems to output repeated content. Could you help...
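On the repetition point, one thing to try is adjusting the sampling parameters at chat time. A minimal sketch, assuming the Python bindings expose generation keyword arguments such as `repetition_penalty` on `Pipeline.chat`; the model path and exact parameter names are assumptions, not verified against this repo:

```python
# Minimal sketch: load a quantized model and raise repetition_penalty to
# discourage repeated output. Path and keyword-argument names (do_sample,
# temperature, top_p, repetition_penalty) are assumptions.
import chatglm_cpp

pipeline = chatglm_cpp.Pipeline("./models/chatglm3-ggml.bin")  # hypothetical path
messages = [chatglm_cpp.ChatMessage(role="user", content="介绍一下北京")]
reply = pipeline.chat(
    messages,
    do_sample=True,
    temperature=0.8,
    top_p=0.8,
    repetition_penalty=1.1,  # values > 1.0 penalize repeated tokens
)
print(reply.content)
```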
```
15.62 Building wheels for collected packages: chatglm-cpp
15.62   Building wheel for chatglm-cpp (pyproject.toml): started
16.21   Building wheel for chatglm-cpp (pyproject.toml): finished with status 'error'
16.22   error: subprocess-exited-with-error
16.22
16.22 ...
```
A question: why are system messages dropped when reading the message history? It seems the system prompt passed in can never be recognized this way:

```python
async def create_chat_completion(body: ChatCompletionRequest) -> ChatCompletionResponse:
    # ignore system messages
    history = [msg.content for msg in body.messages if msg.role != "system"]
    if len(history) % 2 != 1:
        raise ...
```
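A possible fix, sketched under the assumption that newer chatglm_cpp bindings accept full `ChatMessage` objects (role included) rather than a flat list of content strings; this is an assumption about the API shape, not the project's confirmed behavior:

```python
# Hypothetical rewrite: forward every message, including the system prompt,
# as ChatMessage objects instead of stripping roles. Assumes a bindings
# version whose Pipeline.chat accepts a list of ChatMessage.
import chatglm_cpp

def to_chat_messages(body):
    return [
        chatglm_cpp.ChatMessage(role=msg.role, content=msg.content)
        for msg in body.messages  # system messages are kept
    ]
```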
The GPU is found at compile time, but at runtime it errors out saying the GPU cannot be found. What could be the reason?
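A quick check worth running in the same environment; nothing here is specific to chatglm.cpp, it only verifies that the process can see the driver and CUDA libraries at all, since a missing library path at runtime is a common cause of this symptom:

```python
# Generic diagnostic: confirm the runtime environment can reach the GPU.
import os
import shutil
import subprocess

print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH", "<unset>"))
if shutil.which("nvidia-smi"):
    subprocess.run(["nvidia-smi", "-L"], check=False)  # list visible GPUs
else:
    print("nvidia-smi not on PATH; driver may not be installed or visible")
```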
Command:

```sh
python3 chatglm_cpp/convert.py -i baichuan-inc/Baichuan2-13B-Chat -t q4_0 -o baichuan-ggml.bin
```

Environment:

```
# packages in environment at /opt/homebrew/Caskroom/miniconda/base/envs/pytorch:
#
# Name                    Version         Build               Channel
accelerate                0.24.1          pyhd8ed1ab_0        conda-forge
aiofiles                  22.1.0          py312hca03da5_0
aiohttp                   3.9.0...
```
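Once the conversion succeeds, a minimal load-and-chat smoke test might look like this; loading a ggml path via `Pipeline` follows the project's README, but the exact `chat` call shape is assumed and may differ across binding versions:

```python
# Minimal smoke test for the converted Baichuan2 model. Assumes the
# chatglm-cpp Python bindings are installed and that Pipeline.chat
# accepts a list of ChatMessage.
import chatglm_cpp

pipeline = chatglm_cpp.Pipeline("./baichuan-ggml.bin")
reply = pipeline.chat([chatglm_cpp.ChatMessage(role="user", content="你好")])
print(reply.content)
```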
Add a `CMake` flag in `CMakeLists.txt`, following [llama.cpp](https://github.com/ggerganov/llama.cpp). Compile with:

```sh
cmake -B build -DGGML_HIPBLAS=ON \
  -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang \
  -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++ \
  && cmake --build build -j
```
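This mirrors how llama.cpp wires its ROCm backend: `GGML_HIPBLAS` toggles the hipBLAS code path in the bundled ggml, and the ROCm clang toolchain is selected explicitly so the HIP kernels compile with the right compiler.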
Same as the title: could you provide a demo?