Now that I fixed that (I'll submit a PR soon), running on an 8GB Pi results in not-terrible performance:

```
main: seed = 1678806223
llama_model_load: loading model from 'models/llama-7B/ggml-model.bin' -...
```
@davidrutland Basically, undo commit https://github.com/ggerganov/llama.cpp/commit/84d9015c4a91ab586ba65d5bd31a8482baf46ba1 and it should build fine.
https://guillaume-be.github.io/2020-05-30/sentence_piece seems to document the SentencePiece algorithm fairly well
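In the meantime, if anyone wants to compare output against the reference implementation, here's a minimal sketch using the `sentencepiece` Python package (the model path is an assumption; point it at the `tokenizer.model` that ships with the weights):

```python
# pip install sentencepiece
import sentencepiece as spm

# Path is an assumption based on the repo layout used above.
sp = spm.SentencePieceProcessor(model_file="models/llama-7B/tokenizer.model")

print(sp.encode("Hello world", out_type=str))  # subword pieces
print(sp.encode("Hello world", out_type=int))  # token ids
print(sp.decode(sp.encode("Hello world")))     # round-trip back to text
```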
I think we'd have to do a backwards-incompatible file format change to support all the tokenizer's features; it also gives us a chance to do some things needed by #91,...
> I think we'd have to do a backwards-incompatible file format change to support all the tokenizer's features; it also gives us a chance to do some things needed by...
@ggerganov I would suggest a version number. That allows for better error messages like `version unsupported` versus something like `invalid model file`.
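For illustration, a rough sketch of what that check could look like on the loading side, written in Python. The `0x67676d6c` magic matches the current files; the header layout and version constant are assumptions, since the versioned format doesn't exist yet:

```python
import struct
import sys

GGML_MAGIC = 0x67676d6c  # magic of the current (unversioned) ggml format
SUPPORTED_VERSION = 1    # hypothetical: first revision of a versioned format

def check_header(path):
    with open(path, "rb") as f:
        (magic,) = struct.unpack("<I", f.read(4))
        if magic != GGML_MAGIC:
            sys.exit(f"{path}: invalid model file (bad magic 0x{magic:08x})")
        (version,) = struct.unpack("<I", f.read(4))
        if version > SUPPORTED_VERSION:
            sys.exit(f"{path}: version {version} unsupported "
                     f"(this build reads up to version {SUPPORTED_VERSION})")
```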
I know I want it. It should probably be folded into the in-progress API implementation too at #77
Are we basically making an open source ts_server (https://bellard.org/ts_server/) now? If so, I also nominate RWKV (https://github.com/BlinkDL/RWKV-LM)
I wrote a tool to add additional tokens to `tokenizer.model`: https://github.com/Ronsor/llama-tools

The token list:

```
C
[PAD]
```

would work with the script I wrote.
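For anyone curious how that works under the hood: `tokenizer.model` is just a serialized SentencePiece `ModelProto`, so appending a token comes down to editing the proto. A rough sketch with the protobuf bindings bundled in the `sentencepiece` package (not the actual llama-tools code, just the general idea):

```python
# pip install sentencepiece protobuf
from sentencepiece import sentencepiece_model_pb2 as sp_model

m = sp_model.ModelProto()
with open("tokenizer.model", "rb") as f:
    m.ParseFromString(f.read())

# Append a user-defined piece; score 0.0 is a neutral choice for added tokens.
piece = m.pieces.add()
piece.piece = "[PAD]"
piece.score = 0.0
piece.type = sp_model.ModelProto.SentencePiece.USER_DEFINED

with open("tokenizer.model", "wb") as f:
    f.write(m.SerializeToString())
```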
> did anyone made this to work?
>
> i tested half dozen of models.. none of them actually worked.

No, this code does not work.