ARC_Kaggle
the huge gap between train/eval and test
Thanks for your contribution! Such nice work on ARC!
However, there is a huge gap between performance on the train/eval sets and on the test set. On train/eval, the code achieves 60%-80% accuracy, but Kaggle reports only 20% on the test set.
What could possibly account for this difference?
Hi! Sorry to answer this late, I'm only seeing your message now!
There are two main reasons for this gap:
- The train set is easier than the eval and test sets. The organisers of the ARC challenge disclosed this from the beginning.
- I looked at the tasks in the train and eval sets for inspiration about which functionalities to add to my program, so I was probably biased toward them.