CALM
Impact of using widely referenced open source data sets
New to the field, so I might be completely off the mark here, but was any consideration given to the possibility that, because the data sets used are widely referenced and reproduced, they formed part of the models' original foundational training data? If so, that might have boosted model performance compared with using a novel data set.