massive-activations

Code accompanying the paper "Massive Activations in Large Language Models"

Results: 6 massive-activations issues

Hi! Interesting work on the role of explicit bias! I was wondering what training settings got you an eval PPL of ~3.04. The paper mentions that 50K iterations are required...

Interesting work! I have a question, as stated in the title: did you conduct an experiment like that? What was the result? Thanks.

1. How do you get the mean value of a massive activation, e.g. 2546.8/-1502.0 in hook.py? 2. The mean value is still large; what is the difference between using the mean value and using...

Hello, this is great work! I am wondering which layer the analyzed activations come from. Is it the last layer?

Hello, I am interested in the standard deviation of the activation and would like to know how the variance is calculated. Here are a few methods: 1. Calculate the variance...