sasha-hailo

Results 18 comments of sasha-hailo

@gushiqiao , Thank you for your response. Could you please share a configuration for some model that yielded HumanEval results consistent with the officially published ones? I'm a...

@aptsunny , I assume you encountered the same issue I did a few months ago. Please see the bug report I opened in https://github.com/ModelTC/llmc/issues/163. I also posted my fix proposal...

@gushiqiao , Thank you for the update. Is my understanding correct that this currently supports keeping selected layers in full precision (but not the originally intended granularity of supporting _any_...

Hi @Harahan, Thank you for your response. It turns out that a lot of changes have been made since my issue report ([in this commit](https://github.com/ModelTC/llmc/commit/3a6b62ebbff4f9e4b442fa677c9af24a396ff1b6)). The functionality I was referring...

P.S. An unrelated question: I also noticed that the commit I mentioned above added some limited support for additional quantization granularities, via the functions `get_matmul_in_block()`, `get_softmax_in_block()`, and `get_act_fn_in_block()`. Do you plan to...

Did you succeed in reproducing the `mix_bits` problem I reported? I believe the issue should be reopened as a bug...

[LLMC_RTN_W8A8_MixedA16_Bug.txt](https://github.com/user-attachments/files/17635207/LLMC_RTN_W8A8_MixedA16_Bug.txt) [LLMC_RTN_W8A8.txt](https://github.com/user-attachments/files/17635208/LLMC_RTN_W8A8.txt) I'm pretty sure this is a bug, and I now suspect that the issue affects not only RTN, but nearly any method based on *static* quantization. Can you...

To the best of my understanding, if the quantization configuration is the same for all layers of the model, the bug does not apply.
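To make the distinction concrete, here is a minimal, purely illustrative sketch (not LLMC's actual API; the config keys `w_bit`/`a_bit` and the layer names are hypothetical) of the uniformity check implied above: the suspected bug path is only exercised when per-layer quantization settings differ.

```python
# Hypothetical sketch, NOT LLMC's real configuration schema.
# It only illustrates "same config for all layers" vs. mixed precision.

def is_uniform_config(layer_configs: dict) -> bool:
    """Return True if every layer shares identical quantization settings."""
    settings = list(layer_configs.values())
    return all(cfg == settings[0] for cfg in settings[1:])

# W8A8 everywhere -> uniform, bug (per the report) does not apply.
uniform = {
    "layer.0": {"w_bit": 8, "a_bit": 8},
    "layer.1": {"w_bit": 8, "a_bit": 8},
}

# One layer kept at A16 -> mixed precision, the problematic path.
mixed = {
    "layer.0": {"w_bit": 8, "a_bit": 8},
    "layer.1": {"w_bit": 8, "a_bit": 16},  # the exception
}
```

The point is only that a `mix_bits`-style setup is exactly the non-uniform case, which is why an all-W8A8 run would not reproduce the report.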

Hi @Harahan , LLMC folks, I wanted to let you know that I have fixed the bug in my side branch. In addition, I also added support for separate configurable...

@AaronMaYue , I apologize for the late response. If it's still relevant, please let me know and I'll clean up my code for sharing.