Kerfuffle

Results: 159 comments by Kerfuffle

Interesting. Weirdly enough, that actually only has limited support for non-streams (i.e. `mmap`). I don't know if it would be necessary to use the seek features for handling the GGML...
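For reference, a minimal sketch of the non-stream (memory-mapped) side using the `memmap2` crate; the crate choice, file path, and GGML handling here are assumptions for illustration, not what the library in question actually does:

```rust
use std::fs::File;
use memmap2::Mmap; // assumed dependency: memmap2

fn main() -> std::io::Result<()> {
    let file = File::open("ggml-model-q4_0.bin")?; // hypothetical model path
    // Safety: the file must not be truncated or modified while mapped.
    let mmap = unsafe { Mmap::map(&file)? };
    // With a mapping, "seeking" is just slicing into the byte range,
    // rather than calling Seek on a stream.
    let magic = &mmap[0..4];
    println!("first 4 bytes (magic): {:02x?}", magic);
    Ok(())
}
```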

@hlhr202 The CLI is just a consumer of the library crate, so when using the library you'll be able to get the embeddings.
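Roughly what that looks like from the consumer side; every name below is a stubbed-out placeholder standing in for whatever the library crate actually exposes, not llama-rs's real API:

```rust
// Purely illustrative stubs -- Model, Session, load, and start_session
// are placeholders, not the actual llama-rs types or methods.
struct Model;
struct Session {
    embeddings: Vec<f32>,
}

impl Model {
    fn load(_path: &str) -> Model {
        Model
    }
    fn start_session(&self) -> Session {
        Session { embeddings: Vec::new() }
    }
}

fn main() {
    // The CLI makes the same calls internally; a library consumer
    // just gets direct access to the resulting data.
    let model = Model::load("ggml-model.bin");
    let session = model.start_session();
    let embeddings: &[f32] = &session.embeddings;
    println!("embedding length: {}", embeddings.len());
}
```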

Since I added the `--dump-prompt-tokens` option, you can probably guess I like exposing information. :) I know people asked about being able to show the embeddings with llama-cli, so it...

Is it a lot of data? You could probably just print it in the normal Rust debug format, which should look like a comma-separated list if it's in a `Vec`...
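To be concrete, a minimal sketch of what that looks like, assuming the embeddings end up in a `Vec<f32>`:

```rust
fn main() {
    // Assuming the embeddings come back as a Vec<f32>.
    let embeddings: Vec<f32> = vec![0.013, -0.207, 1.442];
    // The Debug format for Vec prints a bracketed, comma-separated list.
    println!("{:?}", embeddings);
    // Output: [0.013, -0.207, 1.442]
}
```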

Ahh, then it seems like it probably isn't worth even bothering to add it to the CLI right now unless someone comes here and requests it. Or they could probably just write...

@philpax Because those would strip off _all_ the trailing newline/carriage return characters, making it 1) impossible to actually use a prompt from a file that ends with a newline and...
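For illustration, a minimal sketch of the difference, assuming the prompt is read into a `String` (this is not necessarily how that thread resolved it): `trim_end_matches` eats every trailing line ending, while `strip_suffix` removes at most one:

```rust
fn main() -> std::io::Result<()> {
    let raw = std::fs::read_to_string("prompt.txt")?; // hypothetical path

    // This strips *all* trailing newline/carriage-return characters,
    // so "foo\n\n\n" and "foo" become indistinguishable:
    let all_stripped = raw.trim_end_matches(|c: char| c == '\n' || c == '\r');

    // strip_suffix removes at most one line ending (CRLF checked first):
    let one_stripped = raw
        .strip_suffix("\r\n")
        .or_else(|| raw.strip_suffix('\n'))
        .unwrap_or(&raw);

    println!("all stripped: {:?}\none stripped: {:?}", all_stripped, one_stripped);
    Ok(())
}
```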

By the way, I hope that comment wasn't too blunt. With the numbered items it does kind of read like I was lecturing you or something. So it may have...

@setzer22 See the comments and referenced issues in the llama.cpp pull: https://github.com/ggerganov/llama.cpp/pull/439 I basically just made the same change; I can't say I understand it or anything. According to the...

Actually, I'm blind: llama-rs does have `--seed`. You can test it out yourself: set the seed to something, generate some tokens, then restart with a different thread count or batch size and...

Sure, that's certainly no problem. Or `increase_determinism`. (As far as I could tell from the testing I did, it was deterministic with a seed specified. There might be edge cases...
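The underlying idea, as a minimal sketch using the `rand` crate; llama-rs's actual sampling code differs, and the "sampling" below is just a stand-in, but it shows why a fixed seed pins down the token sequence as long as nothing else perturbs the RNG:

```rust
use rand::{rngs::StdRng, Rng, SeedableRng};
// assumed dependency: rand = "0.8"

fn generate(seed: u64, n: usize) -> Vec<u32> {
    let mut rng = StdRng::seed_from_u64(seed);
    // Stand-in for sampling n token ids from a 32,000-token vocabulary.
    (0..n).map(|_| rng.gen_range(0..32_000)).collect()
}

fn main() {
    // Same seed, same sequence -- regardless of thread count or batch
    // size, as long as those settings don't change what gets sampled.
    assert_eq!(generate(1234, 8), generate(1234, 8));
    // A different seed yields a different sequence (with overwhelming
    // probability).
    assert_ne!(generate(1234, 8), generate(5678, 8));
    println!("deterministic given the seed: {:?}", generate(1234, 8));
}
```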