rageval

Data quality metrics

Open faneshion opened this issue 11 months ago • 0 comments

In this issue, we discuss potential metrics for evaluating the quality of an input dataset. Dataset quality is very important because many datasets are now generated automatically with ChatGPT.

Some potential quality dimensions for a QA dataset are: 1) the number of instances; 2) the diversity of question topics; 3) the length of the answers; 4) the fluency of the answers; 5) the correctness of the answers; 6) the length of the contexts; 7) the relationship between the contexts and the answer to each question.
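To make the discussion concrete, here is a rough sketch of how the purely statistical dimensions (1, 2, 3, 6 and 7) could be computed. It is not an existing rageval API. The function name `dataset_quality_stats`, the field names `question`, `answer` and `contexts`, whitespace tokenization, the distinct-token ratio as a proxy for topic diversity, and token overlap as a proxy for the context-answer relationship are all assumptions made for illustration. Fluency and correctness (4 and 5) would probably need a language model or human judgment and are left out.

```python
from statistics import mean


def tokenize(text):
    # Naive lowercase whitespace tokenization (an assumption; a real
    # metric would likely use a proper tokenizer).
    return text.lower().split()


def dataset_quality_stats(dataset):
    """Compute simple quality statistics for a QA dataset.

    `dataset` is a list of dicts with the hypothetical keys
    'question' (str), 'answer' (str) and 'contexts' (list of str).
    """
    if not dataset:
        raise ValueError("dataset is empty")

    # Dimension 3: answer length in tokens.
    answer_lengths = [len(tokenize(ex["answer"])) for ex in dataset]

    # Dimension 6: total context length in tokens per instance.
    context_lengths = [
        sum(len(tokenize(c)) for c in ex["contexts"]) for ex in dataset
    ]

    # Dimension 7 (proxy): fraction of answer tokens found in the contexts.
    overlaps = []
    for ex in dataset:
        answer_tokens = set(tokenize(ex["answer"]))
        context_tokens = set()
        for c in ex["contexts"]:
            context_tokens.update(tokenize(c))
        if answer_tokens:
            overlaps.append(len(answer_tokens & context_tokens) / len(answer_tokens))
        else:
            overlaps.append(0.0)

    # Dimension 2 (proxy): ratio of distinct tokens to all tokens in questions.
    question_tokens = [t for ex in dataset for t in tokenize(ex["question"])]
    if question_tokens:
        diversity = len(set(question_tokens)) / len(question_tokens)
    else:
        diversity = 0.0

    return {
        "num_instances": len(dataset),  # Dimension 1
        "question_distinct_token_ratio": diversity,
        "mean_answer_length": mean(answer_lengths),
        "mean_context_length": mean(context_lengths),
        "mean_answer_context_overlap": mean(overlaps),
    }
```

Calling `dataset_quality_stats(examples)` on a list of such dicts returns one dictionary of aggregate values. A low `mean_answer_context_overlap` might hint that answers are not grounded in their contexts, though token overlap is only a crude signal.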

We can discuss it in this issue.

faneshion avatar Mar 05 '24 12:03 faneshion