Andy0422
> see some examples in https://github.com/pytorch/ao/blob/main/test/float8/test_fsdp.py
>
> > we'll be using `quantize_` API everywhere, but maybe not yet for
> > [ao/test/float8/test_fsdp.py](https://github.com/pytorch/ao/blob/137b0795acb3282ce622948b1537e20914186eea/test/float8/test_fsdp.py#L88), line 88 in [137b079](/pytorch/ao/commit/137b0795acb3282ce622948b1537e20914186eea)
> > ...
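For context, a minimal sketch of how torchao's `quantize_` API is invoked; the toy model and the `int8_weight_only` config are illustrative choices only, not what the float8 FSDP test itself uses (that test currently goes through the float8 training conversion path instead):

```python
import torch
from torchao.quantization import quantize_, int8_weight_only

# Toy stand-in for a real model; bfloat16 is a common choice, but exact
# dtype/device requirements can vary by torchao version and backend.
model = torch.nn.Sequential(
    torch.nn.Linear(64, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 8),
).to(torch.bfloat16)

# quantize_ mutates the model in place, swapping Linear weights for
# quantized tensor subclasses according to the given config.
quantize_(model, int8_weight_only())

out = model(torch.randn(2, 64, dtype=torch.bfloat16))
print(out.shape)
```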
> @cli99 We share the wikitext2 PPL results below.
>
> | Granularity | Method | Llama2-7b | Llama2-13b | Llama3-8b |
> | --- | --- | --- | --- | --- |
> | per-channel | smooth + GPTQ | 5.9683 | 5.2091 | 7.4474 |
> | per-channel | rotation + GPTQ | 5.6872 | ... | ... |
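For reference, a hedged sketch of how wikitext2 perplexity is commonly measured with Hugging Face `transformers`/`datasets`; the model id, window length, and fp16 setting are assumptions for illustration, not necessarily the setup that produced the numbers above:

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # illustrative choice
seq_len = 2048                          # illustrative window length

tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

# Concatenate the wikitext2 test split and score it in fixed-size windows.
test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
ids = tok("\n\n".join(test["text"]), return_tensors="pt").input_ids

nlls = []
for i in range(0, ids.size(1) - seq_len, seq_len):
    chunk = ids[:, i : i + seq_len].to(model.device)
    with torch.no_grad():
        # Labels equal to inputs -> mean next-token cross-entropy per window.
        loss = model(chunk, labels=chunk).loss
    nlls.append(loss.float())

ppl = torch.exp(torch.stack(nlls).mean())
print(f"wikitext2 PPL: {ppl.item():.4f}")
```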
> when `max_seq_lengths` is set to 2048, the program will hang on a `while true` loop forever. 4096 or beyond works normally

Yes, 4096 is the shortest length in RULER.
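A simplified illustration of that failure mode (a stand-in for RULER's synthetic-data generation, not its actual code): every sample needs at least 4096 tokens of context, so a smaller length budget can never be satisfied and the retry loop spins forever.

```python
import random

MIN_RULER_LEN = 4096  # shortest sequence length supported by RULER

def build_sample(max_seq_length: int) -> str:
    """Simplified stand-in: redraw candidates until one fits the length budget."""
    while True:
        # Each synthetic sample requires at least MIN_RULER_LEN tokens,
        # so a budget below that threshold can never be met.
        n_tokens = random.randint(MIN_RULER_LEN, MIN_RULER_LEN + 512)
        if n_tokens <= max_seq_length:
            return " ".join(["token"] * n_tokens)

# build_sample(4096)  # terminates
# build_sample(2048)  # loops forever -- the reported hang
```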
> I didn't change the logic of pred.py, only modified it to load the model locally, using qwen2.5-instruct. But across repeated inference runs there are always 100+ samples with empty answers. Why is that?

Same problem here.