klosax
Why not go even further? Make the common infrastructure of llama.cpp become something like "ggml-llm" and the code for the specific llm architectures (llama, gpt-2, gpt-j, mpt and others) become...
The gguf [gpt2 tokenizer](https://github.com/ggerganov/llama.cpp/blob/gguf/cmpnct_gpt2bpe.hpp) also has a Trie implementation. The tokenizer is under the MIT license. Maybe it could be reused for the llama tokenizer.
The author of the gpt2 tokenizer gave permission to use it and stated that it is under the MIT license here: https://github.com/ggerganov/llama.cpp/pull/2398#issuecomment-1667009979
It looks like the MIT and Apache licenses are compatible, but a copy of the Apache license and a NOTICE file must be included: https://softwareengineering.stackexchange.com/questions/51987/how-to-include-an-apache-library-with-my-opensource-code#52223
What is the difference between `max_seq_len` and `context_length`? Aren't both the maximum usable/recommended context length?
I suggest using special key-values to identify special tokens:

- `tokenizer.bos_token_id`: Beginning of sequence marker
- `tokenizer.eos_token_id`: End of sequence marker
- `tokenizer.unk_token_id`: Unknown token
- `tokenizer.sep_token_id`: Separator token
- `tokenizer.pad_token_id`: Padding token
Why not use less cryptic key naming?

- `[llm].hidden_size` --> `[llm].embedding_length`
- `[llm].n_ff` --> `[llm].feedforward_length`
- `[llm].n_layers` --> `[llm].num_layers`
- `[llm].attention.n_heads` --> `[llm].attention.num_heads`
- `[llm].rope.n_dims` --> `[llm].rope.num_dims`

or even better, change `n_` and `num_`...
I tend to prefer `_count` instead of `num_`, as in `gguf_header_t`:

```
uint32_t tensor_count;
uint32_t metadata_kv_count;
```

```
gguf_tensor_info_t:
uint32_t n_dimensions;              --> uint32 dimension_count;
uint32_t dimensions[n_dimensions];  --> uint32 dimensions[dimension_count];
uint32_t...
```
More descriptive: `[llm].rope.scale` --> `[llm].rope.context_scale`
> Luckily, @klosax already [did these for v1 of the spec](https://github.com/klosax/ggml/tree/gguf/examples/gguf)! Hopefully, we can just update this code and we should be good to go.

I think this should be...