CALM Impact of using widely referenced open source data sets

Open • rupnic opened this issue 6 months ago • 3 comments

I'm new to the field and might be completely off the mark here, but was any consideration given to the possibility that the data sets used, being widely referenced and reproduced, formed part of the models' original foundational training data? If so, that could have boosted model performance compared with using a novel data set.

Aug 15 '24 19:08 rupnic
