auto-cot
GSM8K few-shot sampling from test set?
Excellent work! In Appendix D (page 19) of your paper, you show automatically constructed demonstrations for GSM8K. However, I found that these 8 cases come from test.jsonl rather than train.jsonl. Is there a data leakage problem?
Hi, there is no data leakage problem because no gold labels are used. We only collect the questions for automatic rationale generation.
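
To illustrate the point, here is a minimal sketch of question-only demonstration construction. It is not the repository's actual code: `build_demos`, `fake_generate`, and the sample record are hypothetical names and data made up for this example, and it assumes each JSONL record has `question` and `answer` fields. The only thing it is meant to show is that the `answer` field is never read when the demonstrations are built.

```python
import json
from typing import Callable, Dict, List

# Zero-shot trigger phrase used to elicit a rationale from the model.
ZERO_SHOT_TRIGGER = "Let's think step by step."


def build_demos(
    jsonl_lines: List[str],
    generate: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Build demonstrations from questions only (hypothetical helper)."""
    demos = []
    for line in jsonl_lines:
        record = json.loads(line)
        # Only the question is read; record["answer"] (the gold label) is never accessed.
        question = record["question"]
        prompt = f"Q: {question}\nA: {ZERO_SHOT_TRIGGER}"
        # The rationale comes from the model, not from the dataset.
        rationale = generate(prompt)
        demos.append({"question": question, "rationale": rationale})
    return demos


def fake_generate(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption: any text generator works here)."""
    return "(model-generated reasoning would go here)"


# Made-up record in an assumed question/answer layout, not real GSM8K data.
sample = [
    json.dumps(
        {
            "question": "Tom has 3 apples and buys 2 more. How many apples does he have?",
            "answer": "GOLD-LABEL-5",
        }
    )
]

demos = build_demos(sample, fake_generate)
```

Because the gold answer never enters the prompt or the stored demonstration, any answer that appears in a demonstration is whatever the model produced on its own.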