rageval
Data quality metrics
In this issue, we discuss potential metrics for evaluating the quality of an input dataset. Dataset quality matters because many datasets are now generated automatically with ChatGPT.
Potential quality dimensions for a QA dataset include: 1) the number of instances; 2) the diversity of question topics; 3) the length of the answers; 4) the fluency of the answers; 5) the correctness of the answers; 6) the length of the contexts; 7) the relationship between the contexts and the answer to a question.
We can discuss them in this issue.
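To make the discussion concrete, here is a minimal sketch of how some of these dimensions could be computed. It is not rageval's API: the function name `dataset_quality`, the instance keys `question`, `answer`, and `contexts`, and the whitespace tokenizer are all assumptions made for illustration. It covers dimensions 1, 2, 3, 6, and 7, using a distinct-1 token ratio as a rough proxy for question diversity and token overlap as a rough proxy for the context-answer relationship. Fluency and correctness (4 and 5) would likely need a language model or reference answers and are left out.

```python
from statistics import mean


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption, not a rageval utility).
    return text.lower().split()


def dataset_quality(instances):
    """Compute simple quality statistics for a QA dataset.

    `instances` is a list of dicts with the hypothetical keys
    'question' (str), 'answer' (str), and 'contexts' (list of str).
    """
    if not instances:
        raise ValueError("dataset is empty")

    question_tokens = [t for ex in instances for t in tokenize(ex["question"])]
    answer_lengths, context_lengths, overlaps = [], [], []

    for ex in instances:
        answer = tokenize(ex["answer"])
        context = [t for c in ex["contexts"] for t in tokenize(c)]
        answer_lengths.append(len(answer))
        context_lengths.append(len(context))
        # Fraction of answer tokens that also appear in the contexts.
        context_vocab = set(context)
        if answer:
            overlaps.append(sum(t in context_vocab for t in answer) / len(answer))
        else:
            overlaps.append(0.0)

    return {
        "num_instances": len(instances),
        # Unique question tokens divided by total question tokens.
        "question_distinct_1": (
            len(set(question_tokens)) / len(question_tokens)
            if question_tokens else 0.0
        ),
        "mean_answer_length": mean(answer_lengths),
        "mean_context_length": mean(context_lengths),
        "mean_answer_context_overlap": mean(overlaps),
    }


if __name__ == "__main__":
    sample = [
        {
            "question": "what is rag",
            "answer": "retrieval augmented generation",
            "contexts": ["rag means retrieval augmented generation"],
        },
        {
            "question": "what is bm25",
            "answer": "a scoring function",
            "contexts": ["bm25 is a ranking function", "used in search"],
        },
    ]
    for name, value in dataset_quality(sample).items():
        print(f"{name}: {value}")
```

Token-level statistics like these are cheap to compute and easy to compare across datasets, but they are only proxies; a low overlap score, for example, may simply mean the answer paraphrases the context.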