Andy0422
@HandH1998 Hi, thank you for your kind help. I encountered another problem with the calibration data. From my test results below, the results with wikitext2 seem OK, and the results...
> @Andy0422 We used pile for smoothing and wikitext2 for gptq in our paper. But the current code fixes this by using the same dataset for both smoothing...
> @Andy0422 It is probably correct. @HandH1998 One more question: do you employ the online Hadamard transform before the down_proj, or ignore all the online transforms in your implementation? If...
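For context, the online Hadamard transform asked about above (applied to activations right before `down_proj` in rotation-based schemes such as QuaRot) can be sketched with a plain fast Walsh-Hadamard transform. This is an illustrative pure-Python version, not the fused CUDA kernel a real implementation would use:

```python
def fwht(vec):
    """In-place fast Walsh-Hadamard transform (length must be a power of 2).

    In rotation-based quantization schemes, a transform like this is fused
    into the kernel feeding down_proj, spreading activation outliers across
    channels before quantization.
    """
    a = list(vec)
    n = len(a)
    assert n & (n - 1) == 0, "length must be a power of 2"
    h = 1
    while h < n:
        for i in range(0, n, h * 2):
            for j in range(i, i + h):
                x, y = a[j], a[j + h]
                a[j], a[j + h] = x + y, x - y
        h *= 2
    return a

# Applying the transform twice recovers the input scaled by n,
# since the Hadamard matrix H satisfies H @ H = n * I.
```

The orthonormal variant used in practice divides by sqrt(n) so the rotation preserves norms; kernels typically fold that scale into the quantization scales.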
I also met this problem... Many thanks!
> I also met this problem... Many thanks! My problem is that when I use torchao to quantize the Wan2.1 model, it is incompatible with FSDP.
> [@bys0318](https://github.com/bys0318) I was trying to reproduce the results for deepseek-r1. May I know what value of `max_new_tokens` you used? The default `128` results in a cutoff on the...
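As a side note on why the default matters for reasoning models: the whole chain of thought is emitted inside the generation budget, so a small `max_new_tokens` truncates the answer before the final result appears. A toy sketch of the effect (placeholder token counts, not the real decoding loop):

```python
def generate(tokens_needed, max_new_tokens):
    """Toy stand-in for a decoding loop: the 'model' wants to emit
    tokens_needed tokens, but generation stops at max_new_tokens."""
    emitted = min(tokens_needed, max_new_tokens)
    truncated = emitted < tokens_needed
    return emitted, truncated

# A long chain-of-thought answer (say ~1000 tokens) is cut off at the
# default budget of 128, while a larger budget lets it through intact.
```

With reasoning models the budget has to cover the thinking tokens plus the final answer, which is why values in the thousands are common.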
> Hi, for reasoning models such as OpenAI o1 and DeepSeek R1, the w/ CoT setting is not necessary, as these models automatically output their thinking process whether prompted or...
> > I also encountered this problem, did you solve it? Thank you
>
> I get a port error; how do I solve it? Thanks.
> @brisker It is normal that w4a8 first-token latency is slower than w8a8, since the additional dequant operation (on the slower CUDA cores) of w4a8 slows down the main loop, even though...
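To make that trade-off concrete, here is a back-of-the-envelope comparison for a single hypothetical 4096x4096 layer (sizes are illustrative only): w4a8 halves the weight bytes moved, which helps memory-bound decode, but prefill (first token) is compute-bound, so the extra int4-to-int8 unpack/dequant step dominates there.

```python
# Hypothetical single weight matrix; sizes are illustrative only.
rows, cols = 4096, 4096

w8a8_weight_bytes = rows * cols          # int8: 1 byte per element
w4a8_weight_bytes = rows * cols // 2     # int4: two elements packed per byte

# Decode (memory-bound): w4a8 moves half the weight bytes per token.
bytes_saved = w8a8_weight_bytes - w4a8_weight_bytes

# Prefill (compute-bound): both run the same int8 tensor-core math, but
# w4a8 must first unpack/dequantize int4 -> int8 on the slower CUDA cores,
# so its first-token latency can be worse despite the smaller weights.
```

This is why mixed-precision serving stacks sometimes use w8a8 kernels for prefill and w4a8 kernels for decode.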
@jerryzh168 is there any update for this issue? Cheers!