Where can I obtain a generated dataset that includes an options column

Open nanyyyyyy opened this issue 1 year ago • 7 comments

Where can I obtain a generated dataset that includes an options column, which can be used for rank evaluation purposes? Thank you.

nanyyyyyy • May 28 '23 22:05

@nanyyyyyy You would need to regenerate the data and pass an options column through for the relevant datasets, though this could cost a bit of compute. Alternatively, you could isolate the datasets that have options and use a regex to extract the options from the generated text.
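A minimal sketch of the regex route, for illustration only. It assumes the options appear in the formatted input as an "OPTIONS:" header followed by one "- item" line per option; other option layouts would each need their own pattern. The function name `extract_options` is hypothetical and not part of the repository.

```python
import re
from typing import List

# Assumption: options are rendered as an "OPTIONS:" header followed by
# consecutive "- item" lines. Other layouts need their own pattern.
OPTIONS_BLOCK = re.compile(r"OPTIONS:\s*\n((?:[ \t]*-[ \t]+.+(?:\n|$))+)")


def extract_options(text: str) -> List[str]:
    """Return the options listed in `text`, or an empty list if none are found."""
    match = OPTIONS_BLOCK.search(text)
    if match is None:
        return []
    # Drop the leading "-" bullet and surrounding whitespace from each line.
    return [
        line.strip()[1:].strip()
        for line in match.group(1).splitlines()
        if line.strip()
    ]


if __name__ == "__main__":
    sample = "Is the sky blue?\n\nOPTIONS:\n- yes\n- no\n\nAnswer:"
    print(extract_options(sample))
```

Examples where nothing matches return an empty list, which makes it easy to filter out datasets that have no options before running rank evaluation.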

Sorry, this data was intended primarily for training so we didn't pass that information along. Hope this helps though!

shayne-longpre • Jun 02 '23 15:06

Can you explain a bit about this? I want to include options and the exact template for generating each instance in the dataset. What are the detailed steps to achieve this?

gao-xiao-bai • Jun 16 '23 08:06

I haven't figured it out, sorry.

nanyyyyyy • Jun 17 '23 02:06

@nanyyyyyy @gao-xiao-bai So to generate all the templates and options alongside each example you would need to edit the preprocessors used for every task.

One in particular is the formatter (here), which is what applies the pattern (or "template") to each example. You could create a function like this one to store the pattern as a field, and make sure it's passed all the way through to the final generated examples by adding it to the list of passthrough fields here.

To get the answer options you would do the same thing, passing through the "options" key in each example, for the datasets that have the format_options preprocessor (see here).
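The idea can be sketched in plain Python on dictionaries. This only illustrates the passthrough pattern and is not the repository's actual preprocessing code; `apply_pattern`, `finalize`, and `PASSTHROUGH_FIELDS` are hypothetical names, and the example fields are made up.

```python
from typing import Dict, Sequence

Example = Dict[str, object]

# Hypothetical field names; the real passthrough list lives in the repository.
PASSTHROUGH_FIELDS = ("options", "inputs_pattern", "targets_pattern")


def apply_pattern(example: Example, inputs_pattern: str, targets_pattern: str) -> Example:
    """Format an example with a pattern and record the pattern as extra fields."""
    # Only string-valued fields are substituted into the pattern.
    fields = {k: v for k, v in example.items() if isinstance(v, str)}
    formatted = dict(example)
    formatted["inputs"] = inputs_pattern.format(**fields)
    formatted["targets"] = targets_pattern.format(**fields)
    # Store the exact template so it can be recovered from the final example.
    formatted["inputs_pattern"] = inputs_pattern
    formatted["targets_pattern"] = targets_pattern
    return formatted


def finalize(example: Example, passthrough: Sequence[str] = PASSTHROUGH_FIELDS) -> Example:
    """Keep inputs, targets, and any passthrough fields; drop everything else."""
    keep = ("inputs", "targets") + tuple(passthrough)
    return {k: example[k] for k in keep if k in example}


if __name__ == "__main__":
    raw = {"premise": "A", "hypothesis": "B", "answer": "yes", "options": ["yes", "no"]}
    print(finalize(apply_pattern(raw, "{premise} -> {hypothesis}?", "{answer}")))
```

The key point is that a field survives to the final output only if it is both written by an earlier step and named in the passthrough list, so both changes are needed.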

shayne-longpre • Jun 17 '23 19:06

This is super helpful. Thanks a lot!

nanyyyyyy • Jun 17 '23 21:06

Thank you for your response.

gao-xiao-bai • Jun 18 '23 03:06

@nanyyyyyy @gao-xiao-bai Were you guys able to figure this out?

@shayne-longpre I must say it's a little weird not to include the options, since the FLAN paper's evaluations are based on rank classification with options, so it seems like a key thing to include. The data is appreciated nonetheless.

a-antoniades • Sep 04 '23 13:09