cbert_aug
More details about how SST-2 is prepared
The SST-2 dataset included in the repo contains 6,228 training samples, 692 validation samples, and 1,821 test samples. But the official SST-2 dataset (which can be accessed via torchtext) contains 6,920 binary-class training samples and 872 binary-class validation samples. What gives? Could you clarify the discrepancy?
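For what it's worth, the counts above are at least consistent with one reading, which is only a guess on my part and not something the repo states: the repo's training and validation files could be a 90/10 split of the official 6,920-sample training set. A quick arithmetic check using only the numbers quoted above (the variable names are mine):

```python
# Sample counts quoted in this issue.
repo_train, repo_valid, repo_test = 6228, 692, 1821
official_train, official_valid = 6920, 872

# Do the repo's train and validation sets add up to the official training set?
combined = repo_train + repo_valid
adds_up = combined == official_train

# Is the repo's validation set exactly one tenth of the official training set?
is_tenth = repo_valid * 10 == official_train

print(combined, adds_up, is_tenth)  # prints: 6920 True True
```

This only shows that the sizes match such a split. It says nothing about which samples ended up where, or what happened to the official 872 validation samples.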