ponytaill
> I get this: the weights are fake int4, and in the actual calculation they are int16. If it's convenient for you, could you explain it?
> It seems that this level of hardware configuration is not yet able to drive models of 8B and above. Currently, I have found that only models around 1.5B can...
I profiled memory usage with the Android Studio profiling tools, and the result seems to be correct. The left and right peaks represent two models of different sizes.
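As a rough cross-check on the numbers, here is a back-of-the-envelope sketch of weight memory at different bit widths. It assumes the weights dominate memory and ignores activations, KV cache, and runtime overhead; `weight_gib` is a hypothetical helper written for this illustration, not part of any toolkit, and the parameter counts are just the round figures mentioned in this thread.

```python
def weight_gib(n_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GiB (weights only, no runtime overhead)."""
    return n_params * bits_per_weight / 8 / 2**30


# Compare the model sizes discussed above at 4-bit and 16-bit widths.
for name, n_params in [("1.5B", 1.5e9), ("8B", 8e9)]:
    for bits in (4, 16):
        print(f"{name} @ {bits}-bit: {weight_gib(n_params, bits):.2f} GiB")
```

If the weights really are expanded to 16 bits at compute time, the footprint is four times the int4 figure, which would be consistent with an 8B model not fitting on a device where a 1.5B model does.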