Erik Scholz

282 comments by Erik Scholz

@slaren earlier I wrote:

> We should instead "determine the required inference memory per token" https://github.com/ggerganov/llama.cpp/blob/master/main.cpp#L891 , so it can increase the size by itself dynamically https://github.com/ggerganov/llama.cpp/blob/master/main.cpp#L557
>
> edit: I...
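For illustration, a minimal C++ sketch of that idea, using hypothetical names (`EvalBuffer`, `update_estimate`) rather than llama.cpp's actual API: measure how many bytes one evaluation used, derive a per-token estimate, and grow the buffer before the next batch instead of asserting on a fixed-size allocation.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical sketch, not llama.cpp code: size the eval buffer from a
// measured bytes-per-token estimate instead of a hardcoded constant.
struct EvalBuffer {
    std::vector<unsigned char> data;
    size_t mem_per_token = 0; // measured after the first evaluation

    // Call after evaluating n_tokens that consumed used_bytes in total.
    void update_estimate(size_t used_bytes, size_t n_tokens) {
        if (n_tokens > 0) mem_per_token = used_bytes / n_tokens;
    }

    // Grow the buffer so the next batch of n_tokens fits.
    void reserve_for(size_t n_tokens) {
        const size_t needed = mem_per_token * n_tokens;
        if (needed > data.size()) {
            data.resize(needed);
            std::printf("grew eval buffer to %zu bytes\n", data.size());
        }
    }
};

int main() {
    EvalBuffer buf;
    buf.data.resize(512 * 1024);        // initial guess
    buf.update_estimate(800 * 1024, 8); // first eval: 8 tokens used ~800 KiB
    buf.reserve_for(32);                // next batch of 32 tokens fits now
}
```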

Please try https://github.com/ggerganov/llama.cpp/pull/438 and see if it fixes it. I implemented the observations made in this thread.

@Bec-k can you elaborate on what you think is not implemented?

Closing this in favor of https://github.com/ggerganov/ggml/issues/21. Also, https://github.com/saharNooby/rwkv.cpp seems to be it.

@DerekFroese not sure if you are still looking for that, but a shell application with similar features does exist: https://en.wikipedia.org/wiki/MTR_(software)

Checksum for the converted (ggmf v1) Pi3141 alpaca-30B-ggml:

```
$ sha256sum ggml-model-q4_0.bin
969652d32ce186ca3c93217ece8311ebe81f15939aa66a6fe162a08dd893faf8  ggml-model-q4_0.bin
```
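If you want to reproduce that line without coreutils, here is a small C++ sketch (assuming OpenSSL; link with -lcrypto) that prints the same `sha256sum`-style output; the helper name `sha256_file` is just illustrative.

```cpp
#include <openssl/evp.h>
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>

// Stream a file through OpenSSL's EVP SHA-256 and return the hex digest.
static std::string sha256_file(const std::string &path) {
    std::ifstream in(path, std::ios::binary);
    EVP_MD_CTX *ctx = EVP_MD_CTX_new();
    EVP_DigestInit_ex(ctx, EVP_sha256(), nullptr);
    std::vector<char> buf(1 << 16);
    while (in.read(buf.data(), buf.size()) || in.gcount() > 0) {
        EVP_DigestUpdate(ctx, buf.data(), (size_t)in.gcount());
    }
    unsigned char digest[EVP_MAX_MD_SIZE];
    unsigned int len = 0;
    EVP_DigestFinal_ex(ctx, digest, &len);
    EVP_MD_CTX_free(ctx);
    char hex[2 * EVP_MAX_MD_SIZE + 1];
    for (unsigned int i = 0; i < len; ++i) std::sprintf(hex + 2 * i, "%02x", digest[i]);
    return std::string(hex, 2 * (size_t)len);
}

int main() {
    // Two spaces between digest and name, matching sha256sum's output format.
    std::printf("%s  ggml-model-q4_0.bin\n", sha256_file("ggml-model-q4_0.bin").c_str());
}
```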

@anzz1 you did not specify which model your links are for. Also, please provide checksums :)

me: I should try and debug all those crashes
me: `> help me write a song about llama.cpp (c++ api for facebooks llm)`
llama.cpp:
```
A llama is an animal...
```

The ones you linked are sadly mixed, not "pure" LoRA models, so I would assume no. You could just say "pi3141 alpaca 30B" model, and it would be fine...

"mixed" -> "merged" If you look at this for example https://huggingface.co/tloen/alpaca-lora-7b/tree/main , those are **only** the lora weights. I **think** (need to actually read the paper) those are either not...