Data format question

Open sevenandseven opened this issue 1 year ago • 2 comments

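Hello, when evaluating the MS MARCO metrics, should the content data be converted into a format like {"content": "A is ...", "B is ...", "C is ..."}?

That is, each content is followed by multiple candidate passages.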

sevenandseven, May 13 '24 01:05

The data format is:

{"content": "A is ..."}
{"content": "B is ..."}
{"content": "C is ..."}
{"content": "Panda is ..."}
{"content": "... is A"}

Each line is a dict containing a single text, not a list of texts.

You can refer to our example data: https://github.com/FlagOpen/FlagEmbedding/blob/master/examples/finetune/toy_evaluation_data/toy_corpus.json
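
As an illustration of the one-dict-per-line layout described above, here is a minimal sketch that writes a list of passages to such a file and reads it back. The helper names `write_corpus` and `read_corpus` and the file name `corpus.json` are hypothetical and not part of FlagEmbedding; only the `"content"` key comes from the format shown in this thread.

```python
import json
import os
import tempfile


def write_corpus(passages, path):
    """Write one {"content": text} JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for text in passages:
            # ensure_ascii=False keeps non-ASCII text readable in the file
            f.write(json.dumps({"content": text}, ensure_ascii=False) + "\n")


def read_corpus(path):
    """Read the file back, checking that each line holds a single string."""
    passages = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)
            if not isinstance(record, dict) or not isinstance(record.get("content"), str):
                raise ValueError(f"line {line_no}: 'content' must be a single string")
            passages.append(record["content"])
    return passages


# Hypothetical passages mirroring the example lines in this thread.
passages = ["A is ...", "B is ...", "C is ...", "Panda is ...", "... is A"]
corpus_path = os.path.join(tempfile.mkdtemp(), "corpus.json")
write_corpus(passages, corpus_path)
loaded = read_corpus(corpus_path)
```

A file where one `"content"` holds a list of candidate passages would fail the check in `read_corpus`, which is the mistake the question above describes.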

staoxiao, May 13 '24 02:05

Thank you for your reply. I got it working.

sevenandseven, May 13 '24 03:05