Results: 14 comments of Andy0422

@HandH1998 Hi, thank you for your kind help. I encountered another problem with the calibration data. From my test results below, the results with wikitext2 seem OK, and the results...

> @Andy0422 We used pile for smoothing and wikitext2 for gptq in our paper. But the current code has fixed this issue to use the same dataset for both smoothing...

> @Andy0422 It is probably correct. @HandH1998 One more question: do you employ the online Hadamard transform before the down_proj, or ignore all the online transforms in your implementation? If...
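For context on the online Hadamard transform asked about above, here is a minimal NumPy sketch of the usual QuaRot-style trick: rotate the activations with an orthonormal Hadamard matrix just before down_proj, and fold the inverse rotation into the down_proj weight so the layer output is mathematically unchanged. Names and shapes are illustrative only, not taken from the repository under discussion.

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    assert n > 0 and (n & (n - 1)) == 0, "n must be a power of two"
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])  # H_{2k} = [[H_k, H_k], [H_k, -H_k]]
    return H

n = 8
H = hadamard(n) / np.sqrt(n)      # orthonormal: H @ H.T == I

x = np.random.randn(4, n)         # a batch of activations entering down_proj
W = np.random.randn(n, n)         # down_proj weight (applied on the right here)

y_ref = x @ W                     # original layer output
y_rot = (x @ H) @ (H.T @ W)       # rotated activations, compensated weight

# The rotation is lossless in exact arithmetic; its point is to flatten
# activation outliers so that the subsequent quantization loses less.
assert np.allclose(y_ref, y_rot)
```

In a quantized kernel, `x @ H` is the "online" part (computed at run time on the activations), while `H.T @ W` is folded into the quantized weights offline.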

I also met this problem... Many thanks!

> I also met this problem... Many thanks! My problem is that when I use torchao to quantize the Wan2.1 model, it is incompatible with FSDP.

> [@bys0318](https://github.com/bys0318) I was trying to reproduce the results for deepseek-r1. May I know what value of `max_new_tokens` you used? The default `128` results in a cutoff on the...

> Hi, for reasoning models such as OpenAI o1 and DeepSeek R1, the w/ CoT setting is not necessary, as these models automatically output their thinking process whether prompted or...

> > I also encountered this problem, did you solve it? Thank you
>
> Port error. How do I solve it? Thanks.

> @brisker It is normal that the w4a8 first token is slower than w8a8, since the additional dequant operation (on the slower CUDA cores) of w4a8 slows down the main loop, even though...
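The extra dequant step mentioned above can be illustrated in plain Python: a w4a8 kernel stores two 4-bit weights per byte and must unpack (and rescale) them inside the main GEMM loop before the int8 multiply, work that a w8a8 kernel skips entirely. This is a hedged host-side sketch of the packing scheme, not the actual CUDA kernel.

```python
import numpy as np

def pack_int4(w):
    """Pack pairs of unsigned 4-bit values (0..15) into single bytes."""
    w = w.astype(np.uint8)
    assert w.size % 2 == 0
    return (w[0::2] & 0xF) | ((w[1::2] & 0xF) << 4)

def unpack_int4(packed):
    """The extra per-tile dequant work in a w4a8 main loop: split each
    byte back into two 4-bit weights before the int8 multiply."""
    lo = packed & 0xF
    hi = (packed >> 4) & 0xF
    out = np.empty(packed.size * 2, dtype=np.uint8)
    out[0::2] = lo
    out[1::2] = hi
    return out

w = np.random.randint(0, 16, size=16)
assert np.array_equal(unpack_int4(pack_int4(w)), w.astype(np.uint8))
```

w4a8 halves weight memory traffic (which helps the bandwidth-bound decode phase), but the first, compute-bound prefill token pays for this unpacking on the CUDA cores, which is the trade-off the comment describes.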

@jerryzh168 is there any update on this issue? Cheers!