Nikhil Gupta

Results: 28 comments by Nikhil Gupta

I cross-checked the **llama2 7b** model. I was expecting the blocks to be the same size, since each block is exactly the same module with the same dimensions...
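The expectation above can be sanity-checked with a quick parameter count. This is only a sketch; the llama2-7b dimensions (hidden size 4096, FFN size 11008) are the published config, and identically-shaped blocks with the same dtype must serialize to the same number of bytes:

```python
# Hedged sketch: llama2-7b dims (hidden=4096, ffn=11008) from the published
# config. Every transformer block has the same shapes, so the same param
# count, so (at the same dtype) the same serialized weight size.
hidden, ffn = 4096, 11008
attn  = 4 * hidden * hidden   # q, k, v, o projections
mlp   = 3 * hidden * ffn      # gate, up, down projections
norms = 2 * hidden            # two RMSNorm weight vectors
per_block = attn + mlp + norms
print(per_block)              # identical for every block
```

If the exported blocks differ in size despite identical shapes, the difference has to come from the serialization (e.g. per-block quantization metadata), not the weights themselves.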

Okay, so Type 1 and Type 2 quantization for convolutions are based on the sparse vs. compressed blob size for the quantized weights, and I am guessing MNN has both implementations...

I see that you have added support for TinyLlama conversion. This is exactly how I did it and got it working. My multilingual model is training right now. I...

> > I see that you have added support for TinyLlama conversion. This is exactly how I did it and got it working. My multilingual model is training right...

Understood, thanks. I am happy that this bug was caught and a fix is in progress. _I am just guessing that the size of the data structure storing the embedding weights is..._

@wangzhaode I tried to find the overflow in the MNN model export as per your advice. I was not able to locate an overflow, as vocab_size * hidden_size * sizeof(float) < INT_MAX...
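The inequality above can be checked step by step. A minimal sketch, assuming the 160984 vocab size mentioned later in the thread and a TinyLlama-like hidden size of 2048 (an assumption, not from the source); C `int` arithmetic overflows at the first intermediate product that exceeds INT_MAX, so each intermediate is worth checking, not just the final value:

```python
# Hedged sketch: vocab_size is from the discussion; hidden_size=2048 is an
# assumed TinyLlama-like dimension. Check every intermediate product against
# INT_MAX, the way 32-bit C `int` arithmetic would experience it.
INT_MAX = 2**31 - 1
vocab_size, hidden_size, sizeof_float = 160984, 2048, 4

elems = vocab_size * hidden_size
total = elems * sizeof_float
print(elems, elems <= INT_MAX)   # intermediate product fits in int32
print(total, total <= INT_MAX)   # total byte count also fits: no overflow here
```

With these dimensions neither intermediate exceeds INT_MAX, which is consistent with not finding an overflow at this particular multiplication.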

@wangzhaode any help or guidance regarding this?

> @wangzhaode I tried to find the overflow in the MNN model export as per your advice. I was not able to locate an overflow...

I printed all the inputs to my ArgMax layer for the lm head. The maximum value is always at index "1" out of the 160984-token vocab. This is very weird and I...
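A constant argmax across all steps usually means the logits are degenerate (NaN/Inf values, or the wrong tensor being fed in) rather than the model genuinely preferring one token. A minimal debugging sketch, not the MNN API, with `debug_argmax` as a hypothetical helper:

```python
import numpy as np

# Hedged sketch (not MNN code): inspect lm-head logits before ArgMax.
# Reports the argmax and how many entries are non-finite; returns -1 when
# every logit is NaN/Inf, since argmax is meaningless in that case.
def debug_argmax(logits):
    logits = np.asarray(logits, dtype=np.float32)
    n_bad = int(np.count_nonzero(~np.isfinite(logits)))
    idx = int(np.nanargmax(logits)) if n_bad < logits.size else -1
    peak = logits[idx] if idx >= 0 else None
    print(f"argmax={idx} non_finite={n_bad} max={peak}")
    return idx

# A healthy distribution peaks at some data-dependent index:
healthy = np.zeros(160984, dtype=np.float32)
healthy[42] = 5.0
debug_argmax(healthy)   # argmax=42, non_finite=0
```

If `non_finite` is large, the problem is upstream of the ArgMax layer (e.g. the quantized lm-head weights), not the ArgMax itself.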