GPTQ-for-LLaMa potential Mistakes in the test data selection for perplexity evaluation

Open Green-Sky opened this issue 2 years ago • 0 comments

ptb_text_only uses the validation file instead of the test file. While it is still from the same dataset and should give similar results, it makes 1-to-1 comparisons difficult. c4 only has validation, so that is fine.

wikitext-2 uses test https://github.com/qwopqwop200/GPTQ-for-LLaMa/blob/468c47c01b4fe370616747b6d69a2d3f48bab5e4/datautils.py#L13

ptb_text_only uses validation https://github.com/qwopqwop200/GPTQ-for-LLaMa/blob/468c47c01b4fe370616747b6d69a2d3f48bab5e4/datautils.py#L35

c4 uses validation https://github.com/qwopqwop200/GPTQ-for-LLaMa/blob/468c47c01b4fe370616747b6d69a2d3f48bab5e4/datautils.py#L59-L61
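
To make the mismatch concrete, here is a minimal sketch of the check. The dictionaries restate the three observations above; they are not re-read from the linked commit. The names `EVAL_SPLITS`, `HAS_TEST_SPLIT`, `mismatched_datasets`, and `load_ptb_test` are hypothetical and do not exist in `datautils.py`. The `load_ptb_test` helper assumes the Hugging Face `datasets` library and that the `ptb_text_only` dataset (config `penn_treebank`) exposes a `test` split; newer `datasets` versions may need extra arguments to load it.

```python
# Splits per dataset, as described in this issue (assumed, not re-verified).
EVAL_SPLITS = {
    "wikitext2": "test",
    "ptb": "validation",
    "c4": "validation",
}

# Assumption taken from the issue text: c4 has no test split to switch to.
HAS_TEST_SPLIT = {"wikitext2": True, "ptb": True, "c4": False}


def mismatched_datasets(eval_splits, has_test_split):
    """Return datasets evaluated on a non-test split although a test split exists."""
    return sorted(
        name
        for name, split in eval_splits.items()
        if split != "test" and has_test_split.get(name, False)
    )


def load_ptb_test():
    """Hypothetical fix: load the PTB test split instead of validation."""
    # Imported lazily so the check above runs without the library installed;
    # calling this function downloads the dataset.
    from datasets import load_dataset

    return load_dataset("ptb_text_only", "penn_treebank", split="test")


print(mismatched_datasets(EVAL_SPLITS, HAS_TEST_SPLIT))  # → ['ptb']
```

Only ptb is flagged: c4 is skipped because, as noted above, it has nothing but validation to evaluate on.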

please correct me if this is intended. :)

Mar 19 '23 15:03 Green-Sky
