llama2.c
Inference Llama 2 in one file of pure C
Dear Andrej, could you enable Discussions for this repo? That would help folks ask questions in Discussions instead of Issues. For example, I have a few questions regarding running...
- new method to initialize the tokenizer with a given vocab_size
- removed vocab_size from the arguments of build_tokenizer
- applied the changes in run.c, runq.c, test.c
- pass the tokenizer...
Some people (like me) might misread the code and not notice that they need to `pip install -r requirements.txt`; this change will remind the user.
It employs an innovative MoE architecture built on two principal strategies: fine-grained expert segmentation and shared expert isolation. https://github.com/deepseek-ai/DeepSeek-MoE/tree/main https://huggingface.co/deepseek-ai/deepseek-moe-16b-chat
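The two strategies named above can be sketched in a few lines of NumPy. This is an illustration only: the sizes, the single-matrix "experts", and the router are invented for the sketch; DeepSeek's real experts are small FFNs and its routing differs in detail.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_routed, n_shared, top_k = 8, 4, 1, 2

# One weight matrix per expert (a real MoE would use small FFNs here).
routed = [rng.standard_normal((dim, dim)) for _ in range(n_routed)]
shared = [rng.standard_normal((dim, dim)) for _ in range(n_shared)]
gate = rng.standard_normal((dim, n_routed))

def moe_forward(x):
    # Shared expert isolation: these experts run on every token, unrouted.
    out = sum(x @ w for w in shared)
    # Fine-grained segmentation: many small routed experts, top-k per token.
    logits = x @ gate
    probs = np.exp(logits - logits.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)
    topk = np.argsort(probs, axis=-1)[:, -top_k:]
    for t in range(x.shape[0]):
        for k in topk[t]:
            out[t] += probs[t, k] * (x[t] @ routed[k])
    return out

x = rng.standard_normal((3, dim))
y = moe_forward(x)  # same shape as the input: (3, dim)
```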
Modify the original attention:

```python
class Attention(nn.Module):
    def __init__(self, args: ModelArgs):
        super().__init__()
        self.n_kv_heads = args.n_heads if args.n_kv_heads is None else args.n_kv_heads
        assert args.n_heads % self.n_kv_heads == 0
        model_parallel_size = 1
        ...
```
allocate only one scaling factor per group
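For context, "one scaling factor per group" refers to group-wise symmetric quantization in the spirit of runq.c: each fixed-size group of weights shares a single float scale instead of one scale per value. A NumPy sketch of the idea, where the group size and memory layout are assumptions for illustration rather than runq.c's exact format:

```python
import numpy as np

GROUP_SIZE = 32  # assumed group size; runq.c's default may differ

def quantize_q8(w, group_size=GROUP_SIZE):
    """Symmetric int8 quantization: one scale per group of weights."""
    w = w.reshape(-1, group_size)
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0  # one scale per group
    scale[scale == 0] = 1.0                               # avoid divide-by-zero
    q = np.round(w / scale).astype(np.int8)
    return q, scale.squeeze(1)

def dequantize_q8(q, scale):
    return (q.astype(np.float32) * scale[:, None]).reshape(-1)

w = np.linspace(-1.0, 1.0, 64, dtype=np.float32)
q, s = quantize_q8(w)
w2 = dequantize_q8(q, s)
err = np.max(np.abs(w - w2))  # bounded by half a quantization step per group
```

The savings are concrete: with float32 scales and 32-element groups, the per-weight overhead of the scales is 1 byte / 8, versus 4 bytes per weight if every value carried its own scale.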
@karpathy - thank you for the great software. I wrote up a visual walk-through of how it all works in detail. I think I got it all right and am...
Here is the port of llama2.c to pure JavaScript for React Native (mobile): [https://github.com/hootan09/llamajs_rn](https://github.com/hootan09/llamajs_rn)
Hi everyone, I am trying to understand the usage of the "multiple_of" parameter. I understand the purpose of the parameter; however, the code does not seem to be doing what it is supposed...
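For readers landing on this question: in Llama-style model code, `multiple_of` rounds the SwiGLU FFN hidden dimension up to a hardware-friendly multiple. A sketch of the computation as it appears in the reference implementation (the constants 4 and 2/3 are assumed from that code; check model.py for the exact version):

```python
def ffn_hidden_dim(dim: int, multiple_of: int) -> int:
    # Start from 4*dim, shrink by 2/3 (SwiGLU uses three weight matrices
    # instead of two), then round *up* to the nearest multiple of multiple_of.
    hidden_dim = 4 * dim
    hidden_dim = int(2 * hidden_dim / 3)
    hidden_dim = multiple_of * ((hidden_dim + multiple_of - 1) // multiple_of)
    return hidden_dim

# e.g. dim=288: 4*288=1152, two-thirds -> 768, already a multiple of 32
print(ffn_hidden_dim(288, 32))  # -> 768
```

Note the rounding is an integer ceiling, so the result only changes when `2*4*dim/3` is not already a multiple of `multiple_of`, which may be why the parameter appears to have no effect for some configs.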