taher

Results: 51 comments of taher

Potentially this should be run on the GPU on M1, but that requires porting the CUDA kernels over to Metal.

@mirh How would that help?

The serif font issue isn't related to this PR. I've pinged @boocmp on this.

@Witsung I get the same error when I run the migrate command. Have you gotten around this issue yet?

Do the input tokens need to go through many layers to obtain their embeddings? My guess is that the input embeddings should be obtainable earlier, so that it is not...

Are the input embeddings obtained at the starting layer static and not contextual? In order for the embeddings to include context for each input token in the sequence, is it...
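
To illustrate what I mean by "static", here is a minimal sketch under generic transformer assumptions; the names and sizes are illustrative, not llama.cpp's actual tensors:

```python
# Illustrative sketch only, not llama.cpp's API: input embeddings are a
# static table lookup, while contextual embeddings require the layers.
import numpy as np

vocab_size, d_model = 32000, 4096          # hypothetical sizes
rng = np.random.default_rng(0)
tok_embeddings = rng.standard_normal((vocab_size, d_model)).astype(np.float32)

token_ids = [1, 529, 29871]                # example token ids (made up)

# Input embeddings: a pure row lookup, identical for a given token
# regardless of its position or its neighbours in the sequence.
input_embs = tok_embeddings[token_ids]     # shape (3, d_model)

# Contextual embeddings would only appear after running the transformer
# layers, where attention mixes information across positions, e.g.:
#   hidden = input_embs
#   for layer in transformer_layers:       # hypothetical
#       hidden = layer(hidden)             # attention + feed-forward
```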

How are you testing the correctness of the embeddings?

I'm curious whether normalizing reduces the dimensionality of the embedding space. If the goal of the consumer is only to compute cosine similarity from the embeddings, then I think it...
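
For reference, the relation I have in mind is just the standard definition, nothing specific to this PR:

$$
\cos(a, b) = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert} = \hat{a} \cdot \hat{b}, \qquad \hat{a} = \frac{a}{\lVert a \rVert}
$$

So for unit-normalized vectors, cosine similarity reduces to a plain dot product.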

It would be helpful to add an example of generating input embeddings, normalizing them, and computing cosine similarity in the example folder in this PR.
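
Something along these lines, for instance. This is only a rough sketch; the embedding vectors below are random stand-ins for whatever this PR ends up exposing, not output from an existing llama.cpp call:

```python
# Rough sketch of the requested example: obtain input embeddings,
# L2-normalize them, then compute cosine similarity.
import numpy as np

def l2_normalize(v: np.ndarray) -> np.ndarray:
    """Rescale a vector to unit length; the number of dimensions is unchanged."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity; for unit-length vectors this is just the dot product."""
    return float(np.dot(l2_normalize(a), l2_normalize(b)))

# Stand-ins for embeddings produced by this PR; replace with the real
# output once the embedding API is available.
emb_a = np.random.default_rng(1).standard_normal(4096)
emb_b = np.random.default_rng(2).standard_normal(4096)

print(cosine_similarity(emb_a, emb_b))
```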

If you are asking about applying rotary embeddings, that is done in the [llama.cpp](https://github.com/ggerganov/llama.cpp/blob/master/llama.cpp#L705) file and not during conversion.
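
For reference, the rotary step looks roughly like the following. This is a generic RoPE sketch under the usual formulation, not the exact llama.cpp kernel:

```python
# Generic rotary position embedding (RoPE) sketch: applied at inference
# time to the query/key vectors inside attention, not during conversion.
import numpy as np

def apply_rope(x: np.ndarray, position: int, base: float = 10000.0) -> np.ndarray:
    """Rotate consecutive dimension pairs of `x` by position-dependent angles."""
    d = x.shape[-1]
    out = x.copy()
    for i in range(0, d, 2):
        theta = position * (base ** (-i / d))
        cos_t, sin_t = np.cos(theta), np.sin(theta)
        x0, x1 = x[i], x[i + 1]
        out[i] = x0 * cos_t - x1 * sin_t
        out[i + 1] = x0 * sin_t + x1 * cos_t
    return out

q = np.random.default_rng(0).standard_normal(128).astype(np.float32)
# The same vector gets a different rotation at each position, which is
# how positional information enters the attention scores.
q_rotated = apply_rope(q, position=5)
```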