compilade

109 comments by compilade

> I tried to modify the `vocab_size` field in `config.json` from `92544` to `92550`

I meant to set it to 92544, to match the tensor size, but from what you...

@Sakura4036 Do you happen to have an `added_tokens.json` file in the same directory as the model? This seems to be the only thing other than the `vocab_size` field which could affect...

> Yes, an `add_tokens.json` file does exist in the exported model folder. Should I delete it?

Yes, you can delete it (or you can rename the file to something else)....

> I have noticed that convert does not produce a "pure" f16.

Do you mean that some tensors are in `F32` in the resulting `gguf` model? These are usually 1D...
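(If you want to check this yourself, here's a small sketch using the gguf C API from the ggml tree; depending on the version these declarations live in `gguf.h` or directly in `ggml.h`, so treat the includes as an assumption.)

```cpp
#include <cstdio>

#include "ggml.h"
#include "gguf.h" // in older trees the gguf API is declared in ggml.h

int main(int argc, char ** argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s model.gguf\n", argv[0]);
        return 1;
    }

    struct gguf_init_params params = { /* no_alloc = */ true, /* ctx = */ nullptr };
    struct gguf_context * ctx = gguf_init_from_file(argv[1], params);
    if (!ctx) {
        fprintf(stderr, "failed to read %s\n", argv[1]);
        return 1;
    }

    // print each tensor's name and type; small 1D tensors (e.g. norm weights)
    // typically show up as f32 even in an otherwise f16 model
    for (int64_t i = 0; i < gguf_get_n_tensors(ctx); ++i) {
        printf("%-48s %s\n",
            gguf_get_tensor_name(ctx, i),
            ggml_type_name(gguf_get_tensor_type(ctx, i)));
    }

    gguf_free(ctx);
    return 0;
}
```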

(note for later) This will (trivially) conflict with at least:

- #17069
- #15667
- #15727
- (non-existent yet, but WIP) convert : generalized repacking for pre-quantized models

> I see 2 possibilities:
>
> 1. when not specified, the seed is shown "wrong"
> 2. when entered manually the seed is interpreted differently.

This is weird because...

AHA! The sampling seed in `params.sparams.seed` is set by `--seed`, but not when choosing a default seed in `main.cpp`. This seems to fix it:

```diff
diff --git a/examples/main/main.cpp b/examples/main/main.cpp
index ...
```
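For context, here's a minimal self-contained sketch of the bug pattern (the struct names are hypothetical stand-ins for `gpt_params` / `llama_sampling_params`, and `0xFFFFFFFF` is assumed to be the "unspecified" sentinel): when a default seed is chosen, it also has to be propagated to the sampling parameters, otherwise the two diverge.

```cpp
#include <cstdint>
#include <cstdio>
#include <ctime>

// hypothetical stand-ins for llama.cpp's gpt_params / llama_sampling_params
struct sampling_params { uint32_t seed = 0xFFFFFFFF; };
struct gpt_params {
    uint32_t        seed = 0xFFFFFFFF; // sentinel: "pick a random seed"
    sampling_params sparams;
};

int main() {
    gpt_params params;
    if (params.seed == 0xFFFFFFFFu) {
        params.seed = (uint32_t) time(nullptr); // default seed chosen here
    }
    // the gist of the fix: keep the sampling seed in sync with the chosen
    // default instead of leaving it at the sentinel value
    params.sparams.seed = params.seed;
    printf("seed = %u, sampling seed = %u\n", params.seed, params.sparams.seed);
    return 0;
}
```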

> so what was the seed when not specified? 0?

When not specified, the sampling seed is random. https://github.com/ggerganov/llama.cpp/blob/22f281aa16f44d8f6ec2c180a0685ff27e04e714/common/sampling.cpp#L82
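In other words, the "unspecified" sentinel resolves to a random seed rather than 0. A paraphrased sketch (not the exact source at that line; the constant stands in for `LLAMA_DEFAULT_SEED`):

```cpp
#include <cstdint>
#include <random>

constexpr uint32_t DEFAULT_SEED = 0xFFFFFFFF; // stand-in for LLAMA_DEFAULT_SEED

uint32_t resolve_seed(uint32_t seed) {
    if (seed == DEFAULT_SEED) {
        return std::random_device{}(); // random, not 0
    }
    return seed;
}
```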

> I tried to figure out why using >1 slot does not produce deterministic results when doing parallel requests.

Do you know why it is not possible to get deterministic...

> Do you think using `inp_pos` to calculate offset makes sense?

Not all models use `inp_pos` (e.g. recurrent models don't). Also, the `head` of the self-attention unified KV cache doesn't...