Yiming Cui
https://github.com/ymcui/Chinese-LLaMA-Alpaca/issues/315
It is not solved yet. You can actively follow this PR, where the issue is being investigated: https://github.com/ggerganov/llama.cpp/pull/1826
> LlamaChat v2 is coming with expanded support for ggml and other models. Development has stalled for a bit, but hopefully I'll be able to get back to it soon...
See the FAQ answer: https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/wiki/faq_zh
Thanks for your interest; we expect to share the models with everyone very soon.
The original model can run on the GPU, and so can ours.
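A minimal sketch of GPU loading through transformers, assuming a local checkpoint (the path below is a placeholder) and that accelerate is installed so `device_map="auto"` works:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/llama-3-chinese-8b-instruct"  # placeholder path

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # half precision fits an 8B model on a 24 GB GPU
    device_map="auto",          # let accelerate place layers on the available GPU(s)
)
```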
Are you loading the GGUF version? The contents of the instruction template: ``` {% set loop_messages = messages %}{% for message in loop_messages %}{% set content = '' + message['role'] + ' '+ message['content'] | trim + '' %}{%...
We did not change the instruction template; it is identical to Meta-Llama-3-8B-Instruct's. Loading Meta-Llama-3-8B-Instruct does indeed show the endless-generation problem; we can only wait for these downstream tools to catch up. Currently tested and working: native transformers, llama.cpp, and LM Studio. The rest all have issues to varying degrees.
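A quick way to confirm the templates match is to render the chat template with transformers' `apply_chat_template`; a sketch, with `model_path` as a placeholder:

```python
from transformers import AutoTokenizer

model_path = "path/to/llama-3-chinese-8b-instruct"  # placeholder path
tokenizer = AutoTokenizer.from_pretrained(model_path)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "你好"},
]
# Renders the Jinja chat template stored in tokenizer_config.json and
# appends the header that cues the assistant's turn.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)  # compare against the same call on Meta-Llama-3-8B-Instruct
```

(The endless-generation symptom in downstream tools usually traces to `<|eot_id|>` not being treated as a stop token.)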
1) llama.cpp only recently changed its pre-tokenizer, and other downstream programs (such as ollama) may not adapt quickly; 2) the Modelfile may need updating. My suggestion is to wait a bit longer for downstream adaptation; alternatively, you can run inference directly with upstream llama.cpp.
I just tried the original Meta-Llama-3-8B-Instruct and it shows a similar problem; let's wait for downstream adaptation. llama.cpp does not have this issue.
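For direct inference against the GGUF file, a sketch using the llama-cpp-python bindings over upstream llama.cpp; the model path is a placeholder, and `chat_format="llama-3"` assumes a build recent enough to ship that template:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./llama-3-chinese-8b-instruct-q4_0.gguf",  # placeholder path
    chat_format="llama-3",  # built-in Llama 3 template; stops on <|eot_id|>
    n_ctx=4096,
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "你好"}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```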