Import all the CrossFit tasks

Open dirkgr opened this issue 3 years ago • 0 comments

CrossFit has a somewhat unified format for their tasks. We could use it to get a bunch of tasks with very little code.

Here is a list of patterns that @ibeltagy found in CrossFit:

classification
- plain input / output: https://github.com/INK-USC/CrossFit/blob/master/tasks/ade_classification.py
- title: .... [SEP] content: ... https://github.com/INK-USC/CrossFit/blob/master/tasks/amazon_polarity.py
- premise: ... [SEP] hypothesis: .... https://github.com/INK-USC/CrossFit/blob/master/tasks/anli.py
- observation1: ...[SEP] observation2: ... [SEP] hypothesis1: .... ..... https://github.com/INK-USC/CrossFit/blob/master/tasks/art.py
- question: .... [SEP] context: ....  https://github.com/INK-USC/CrossFit/blob/master/tasks/boolq.py
- ... [SEP] .... https://github.com/INK-USC/CrossFit/blob/master/tasks/scicite.py
-  ... and many more similar to above with different field names
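
Most of the classification patterns above share one shape: optional `name:` prefixes joined by `[SEP]`. A minimal sketch of a parser for that shape could look like the following. The function name `parse_fields` and the positional `field_N` keys are made up for illustration and are not part of CrossFit.

```python
SEP = "[SEP]"


def parse_fields(text: str) -> dict[str, str]:
    """Split 'name: value [SEP] name: value' into {'name': 'value', ...}.

    Segments without a 'name:' prefix are stored under positional keys
    ('field_0', 'field_1', ...), which covers the bare '... [SEP] ...' pattern.
    """
    fields = {}
    for i, segment in enumerate(text.split(SEP)):
        segment = segment.strip()
        name, colon, value = segment.partition(":")
        # Treat the prefix as a field name only if it looks like an identifier.
        if colon and name.strip().isidentifier():
            fields[name.strip()] = value.strip()
        else:
            fields[f"field_{i}"] = segment
    return fields


print(parse_fields("premise: A man sleeps. [SEP] hypothesis: He is awake."))
print(parse_fields("some text [SEP] more text"))
```

One known weakness of this heuristic: plain text that happens to start with a single word and a colon would be misread as a named field, so tasks using the plain pattern probably need to bypass it.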

text to text
- summarize: .....
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/gigaword.py
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/multi_news.py
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/reddit_tifu.py
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/samsum.py
- question: ... context: ... 
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/adversarial_qa.py
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/ropes.py
	- (Most follow this template)
- question: ... [SEP] category: ... 
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/jeopardy.py
	- very few follow this template
- ... [SEP] .... 
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/ade_effect.py
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/definite_pronoun_resolution.py
- ..<question string>.. [SEP] ..<context string>.. [SEP] ..<choices>... https://github.com/INK-USC/CrossFit/blob/master/tasks/cosmos_qa.py
- <question string>. <choices>. 
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/ai2_arc.py
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/hellaswag.py
	- should have been converted to classification
	- (multiple-choice datasets are a huge mess)
- question: ... https://github.com/INK-USC/CrossFit/blob/master/tasks/break.py
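
For the multiple-choice tasks that "should have been converted to classification", one possible conversion is to expand each instance into one binary example per choice. This is only a sketch: it assumes the question, the choices, and the index of the correct answer are already available as separate values, and the `question:` / `choice:` field names and the function name `to_classification` are hypothetical.

```python
from typing import List, Tuple


def to_classification(
    question: str, choices: List[str], answer_index: int
) -> List[Tuple[str, int]]:
    """Expand one multiple-choice instance into binary classification examples.

    Each choice becomes its own example, labeled 1 if it is the correct
    answer and 0 otherwise.
    """
    examples = []
    for i, choice in enumerate(choices):
        text = f"question: {question} [SEP] choice: {choice}"
        examples.append((text, int(i == answer_index)))
    return examples


for text, label in to_classification("What is 2+2?", ["3", "4"], 1):
    print(label, text)
```

Scoring each choice independently keeps the input format in line with the other classification patterns, at the cost of producing several examples per original instance.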

sequence tagging: 
- ... [SEP] acronym: .... https://github.com/INK-USC/CrossFit/blob/master/tasks/acronym_identification.py
- <string>
	- input: <string>
	- output: <entity> [SEP]  <entity> .... 
	- https://github.com/INK-USC/CrossFit/blob/master/tasks/limit.py
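
For the `<entity> [SEP] <entity> ....` output pattern, recovering the list of entities is a split on the separator. A small sketch follows; the helper name `split_entities` is made up here.

```python
def split_entities(output: str, sep: str = "[SEP]") -> list[str]:
    """Split a '<entity> [SEP] <entity> ...' output string into entities.

    Surrounding whitespace is stripped and empty segments are dropped.
    """
    return [part.strip() for part in output.split(sep) if part.strip()]


print(split_entities("aspirin [SEP]  ibuprofen"))
```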

regression
- review: ... https://github.com/INK-USC/CrossFit/blob/master/tasks/app_reviews.py
- https://github.com/INK-USC/CrossFit/blob/master/tasks/google_wellformed_query.py
- question: .... [SEP] context: ... https://github.com/INK-USC/CrossFit/blob/master/tasks/mocha.py

Other:
- https://github.com/INK-USC/CrossFit/blob/master/tasks/numer_sense.py

Feb 25 '22 00:02 dirkgr