llCurious

28 comments by llCurious

@YixinSong-e. Same question here. I notice that PowerInfer adopts a different sparsity strategy, i.e., considering the power law, while Deja Vu seems to use a threshold of 0 for sparsity....
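For context, a toy sketch of how I read the two strategies (random data, not PowerInfer's or Deja Vu's actual code; the layer shape and the 80% coverage target are just my assumptions):

```python
import numpy as np

# Hypothetical ReLU pre-activations for one FFN layer: (tokens, neurons).
pre_act = np.random.randn(1024, 4096)
active = pre_act > 0                        # Deja Vu-style: threshold at 0

# Per-token sparsity under the threshold-0 view.
per_token_sparsity = 1.0 - active.mean(axis=1)

# PowerInfer-style (my reading): neuron firing frequency follows a power law,
# so a small set of "hot" neurons covers most activations.
freq = active.mean(axis=0)                  # how often each neuron fires
coverage = np.cumsum(np.sort(freq)[::-1]) / freq.sum()

print("mean per-token sparsity:", per_token_sparsity.mean())
print("neurons covering 80% of activations:", int(np.searchsorted(coverage, 0.8)) + 1)
```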

Hi @YixinSong-e. I notice that you provide ReLU-LLaMA on HF. I ran the model and found that the sparsity (the fraction of values lower than zero) is much lower than OPT...
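Roughly how I measured it (placeholder repo id; I'm assuming the HF LLaMA-style module naming `mlp.gate_proj` for the projection that ReLU is applied to):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "relu-llama-placeholder"        # substitute the actual ReLU-LLaMA checkpoint
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

stats = []

def count_sparsity(module, inputs, output):
    # Fraction of gate pre-activations that ReLU zeroes out (values <= 0).
    stats.append((output <= 0).float().mean().item())

for name, module in model.named_modules():
    if name.endswith("mlp.gate_proj"):     # assuming LLaMA-style gate/up/down MLP naming
        module.register_forward_hook(count_sparsity)

inputs = tok("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
with torch.no_grad():
    model(**inputs)

print("mean sparsity across layers:", sum(stats) / len(stats))
```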

Thanks for your reply. I have a question: in my understanding, ReGLU uses element-wise multiplication, which means the zero values after ReLU remain zero, theoretically yielding the same sparsity level...
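A tiny example of what I mean (toy weights, not the actual model): zeros produced by ReLU on the gate survive the element-wise product, so ReGLU's output should be at least as sparse as the ReLU output itself.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(4, 16)
w_gate, w_up = torch.randn(16, 64), torch.randn(16, 64)

gate = F.relu(x @ w_gate)        # ReLU zeroes out the negative entries
hidden = gate * (x @ w_up)       # ReGLU: the element-wise product keeps those zeros

print((gate == 0).float().mean().item(),     # ReLU sparsity
      (hidden == 0).float().mean().item())   # ReGLU sparsity -- should match
```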

Hi @hodlen. I notice that you provide ReLUFalcon-40B on HF. Do you have the tuned ReLU-Falcon-7B weights?

Thanks, I got it. BTW, do you have a README on using BigDL's optimized int4/int8 quantized computation library? Maybe the use case of quantized matmul in Linear layers in models...

Any progress? @qihqi @ManfeiBai I also encountered the same issue.

Any progress on fixing this issue? @Guangxuan-Xiao

Hey @jlwatson, sorry to bother you, but do you have any ideas?

Thanks for your quick reply! I just ran the lenet-norelu model, and the accuracy results seem fine. BTW, do you mean for other networks like secureml, minionn and...

Hi @warpoons. Interesting idea. As pointed out by @fionser, the problem is due to the **increasing amount of truncations**. According to the Winograd algorithm, the matmul is separated...
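For anyone landing here, a quick plain floating-point refresher on the F(2,3) case I had in mind; as I understand it, the element-wise products (and the fractional coefficients in G) are exactly where the extra fixed-point truncations come from:

```python
import numpy as np

# Standard Winograd F(2,3) transform matrices (1-D, 2 outputs, 3-tap filter).
B_T = np.array([[1, 0, -1,  0],
                [0, 1,  1,  0],
                [0, -1, 1,  0],
                [0, 1,  0, -1]], dtype=float)
G = np.array([[1,    0,   0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0,    0,   1]], dtype=float)
A_T = np.array([[1, 1,  1,  0],
                [0, 1, -1, -1]], dtype=float)

d = np.array([1.0, 2.0, 3.0, 4.0])   # input tile
g = np.array([0.5, -1.0, 2.0])       # filter

# Winograd: the big multiplication is replaced by an element-wise product
# between the transformed tile and the transformed filter.
m = (G @ g) * (B_T @ d)
y_winograd = A_T @ m

# Direct 1-D valid convolution (correlation form) for comparison.
y_direct = np.array([d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
                     d[1]*g[0] + d[2]*g[1] + d[3]*g[2]])

print(y_winograd, y_direct)   # should agree: [4.5, 6.0]
```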