ReFinED
Some questions about training dataset
Great work!
I executed the following command and obtained the data file named wikipedia_links_aligned_spans.json in the folder ~/.cache/refined/datasets.
python3 src/refined/training/train/train.py --experiment_name test
I have two questions regarding this file:
- Is wikipedia_links_aligned_spans.json the training data?
- If so, which fields are used for training? I found three fields in wikipedia_links_aligned_spans.json: hyperlinks_clean, hyperlinks, and predicted_spans. I'm not familiar with these three fields, and I'm unsure how to obtain the training data from them.
Thanks!