sasha-hailo

Results 18 comments of sasha-hailo

@gushiqiao , Thank you for your response. Could you please share a configuration for some model that yielded HumanEval results consistent with the officially published ones? I'm a...

@aptsunny , I assume you encountered the same issue I did a few months ago. Please see the bug report I opened in https://github.com/ModelTC/llmc/issues/163. I also posted my fix proposal...

@gushiqiao , Thank you for the update. Is my understanding correct that this currently supports keeping selected layers in full precision (but not the originally intended granularity of supporting _any_...

Hi @Harahan, Thank you for your response. It turns out that a lot of changes have been made since my issue report ([in this commit](https://github.com/ModelTC/llmc/commit/3a6b62ebbff4f9e4b442fa677c9af24a396ff1b6)). The functionality I was referring...

P.S. An unrelated question: I also noticed that the commit I mentioned above added some limited support for additional quantization granularities, via the functions `get_matmul_in_block()`, `get_softmax_in_block()`, and `get_act_fn_in_block()`. Do you plan to...

Did you succeed in reproducing the `mix_bits` problem I reported? I believe the issue should be reopened as a bug...

[LLMC_RTN_W8A8_MixedA16_Bug.txt](https://github.com/user-attachments/files/17635207/LLMC_RTN_W8A8_MixedA16_Bug.txt) [LLMC_RTN_W8A8.txt](https://github.com/user-attachments/files/17635208/LLMC_RTN_W8A8.txt) I'm pretty sure this is a bug, and I now suspect that the issue affects not only RTN, but nearly any method based on *static* quantization. Can you...

To the best of my understanding, if the quantization configuration is the same for all layers of the model, the bug does not apply.
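To make the distinction concrete, here is a minimal, purely illustrative sketch (not LLMC's actual API; the config keys `w_bit`/`a_bit` and the layer names are hypothetical) of the uniformity check implied above: the suspected bug path is only exercised when per-layer quantization settings differ.

```python
# Hypothetical sketch, NOT LLMC's real configuration schema.
# It only illustrates "same config for all layers" vs. mixed precision.

def is_uniform_config(layer_configs: dict) -> bool:
    """Return True if every layer shares identical quantization settings."""
    settings = list(layer_configs.values())
    return all(cfg == settings[0] for cfg in settings[1:])

# W8A8 everywhere -> uniform, bug (per the report) does not apply.
uniform = {
    "layer.0": {"w_bit": 8, "a_bit": 8},
    "layer.1": {"w_bit": 8, "a_bit": 8},
}

# One layer kept at A16 -> mixed precision, the problematic path.
mixed = {
    "layer.0": {"w_bit": 8, "a_bit": 8},
    "layer.1": {"w_bit": 8, "a_bit": 16},  # the exception
}
```

The point is only that a `mix_bits`-style setup is exactly the non-uniform case, which is why an all-W8A8 run would not reproduce the report.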

Hi @Harahan , LLMC folks, I wanted to let you know that I have fixed the bug in my side branch. In addition, I also added support for separate configurable...

@AaronMaYue , I apologize for the late response. If it's still relevant, please let me know and I'll clean up my code for sharing.