CompareNet_FakeNewsDetection
Using different training/dev data
Hi,
I'm trying to reproduce the in-domain 4-way classification experiment from your paper, i.e., split LUN-train 80:20 into training and validation sets, and then use LUN-test as the in-domain test set. However, when I feed the model a randomly selected 80% of fulltrain.csv, I get the following error:
In addition, for the out-of-domain 2-way classification task, I tried to generate training and dev files that contain only the “trusted” and “satirical” classes, as described in your paper (under Experiments, 2-way classification). Feeding such a training file to the model caused the same error.
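For reference, here is a minimal sketch of the kind of split and class filter described above. It is not the repository's preprocessing code. The helper names `split_rows` and `keep_labels` are hypothetical, and the row layout (label in the first column, with "1" and "4" standing in for the satirical and trusted classes) is an assumption that should be checked against the actual fulltrain.csv.

```python
import random


def split_rows(rows, train_frac=0.8, seed=0):
    """Shuffle a copy of `rows` and split it into (train, dev) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]


def keep_labels(rows, labels, label_col=0):
    """Keep only the rows whose label column is in `labels`."""
    wanted = set(labels)
    return [row for row in rows if row[label_col] in wanted]


# Toy rows standing in for the CSV contents: [label, text].
# ASSUMPTION: the label is in column 0 and is one of "1".."4".
rows = [[str(1 + i % 4), f"document {i}"] for i in range(20)]

# In-domain setup: 80:20 split of the training rows.
train, dev = split_rows(rows, train_frac=0.8, seed=0)

# 2-way setup: keep two classes only.
# ASSUMPTION: "1" and "4" are placeholder codes for satirical and trusted.
two_way_train = keep_labels(train, {"1", "4"})
two_way_dev = keep_labels(dev, {"1", "4"})
```

With real data, the rows could be read with `csv.reader` and written back with `csv.writer`. Note that a split like this only changes the CSV files, so any precomputed files that are indexed by the original row order would no longer line up with them.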
I assume the adj files are adjacency matrices? They seem to be built specifically for fulltrain.csv as the training file. Do you have any idea how I can fix this?
Or, more generally, how can we use other files for training? Thank you!
I also want to adapt this method to other datasets for general training. However, "data_loader.py" uses hard-coded values instead of variables, so it is hard for me to understand what the data preprocessing step does.