InfiniteBench
Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
Could you please provide the code for generating samples of Math.Find, Math.Calc, Code.RUN and Code.debug? I want to generate some test samples with a shorter length, since my model only supports...
https://github.com/OpenBMB/InfiniteBench/blob/main/src/compute_scores.py#L238 1. only one reference label is used for comparison; it would be better to loop over each answer in the label, e.g. label=['ECKER', 'COMMANDER BILL ECKER']; 2. the prediction phrase is split into words for...
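The suggested fix could be sketched as follows. This is a minimal illustration, not the repository's actual scoring code: the function name, the word-level matching rule, and the case folding are all assumptions made for the example.

```python
# A sketch of scoring against multiple reference labels, assuming the
# label field may be either a single string or a list of acceptable
# answers (e.g. ['ECKER', 'COMMANDER BILL ECKER']).
def first_word_match(prediction: str, labels) -> bool:
    """Return True if any reference answer is contained in the prediction."""
    if isinstance(labels, str):
        labels = [labels]  # normalize a single label to a one-element list
    pred_words = prediction.upper().split()
    for label in labels:  # loop over every acceptable answer, not just the first
        label_words = label.upper().split()
        # accept if every word of this label appears in the prediction
        if all(w in pred_words for w in label_words):
            return True
    return False
```

With the example from the issue, a prediction containing "ECKER" would then score as correct even when the longer form "COMMANDER BILL ECKER" does not match.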
When I try to run the following code in Colab: `from datasets import load_dataset; dataset = load_dataset("xinrongzhang2022/InfiniteBench")` I get the following error: > DatasetGenerationCastError: An error occurred while generating the...
GPT-4o
How is GPT-4 run if the API has a hard cutoff of 128k tokens? The EN.QA and EN.MC datasets look to be more than 128k tokens by themselves. Am I missing...
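A common way long-context evaluations fit an over-long prompt into a fixed window is middle truncation: keep the head and tail of the token sequence and drop the middle, since instructions and questions usually sit at the ends. This is only a sketch of that general technique, not a claim about what this repository does:

```python
# Middle truncation: keep the first and last tokens of an over-long
# sequence so the total stays within max_tokens. `tokens` is any list
# (e.g. token ids); the split point is an illustrative choice.
def truncate_middle(tokens: list, max_tokens: int) -> list:
    if len(tokens) <= max_tokens:
        return tokens  # already fits, nothing to drop
    half = max_tokens // 2
    # keep `half` tokens from the head and the remainder from the tail
    return tokens[:half] + tokens[len(tokens) - (max_tokens - half):]
```

For example, truncating a 10-token sequence to 4 tokens keeps the first two and last two tokens.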
Are the GPT-4 results evaluated on a different set of `longbook_qa_eng`? The 'ground_truth' fields in [results/gpt4/preds_longbook_qa_eng.jsonl](https://github.com/OpenBMB/InfiniteBench/blob/main/results/gpt4/preds_longbook_qa_eng.jsonl) don't seem to match the ground_truth in [results/chatglm3/preds_longbook_qa_eng.jsonl](https://github.com/OpenBMB/InfiniteBench/blob/main/results/chatglm3/preds_longbook_qa_eng.jsonl)