Impact-of-KG-Context-on-ED
The training script for RoBERTa does not match the paper's specifications
There are some serious discrepancies between the code and the paper.
Below is the image describing the inputs to the RoBERTa model.
However, after running the scripts, the discrepancies I found are:
- The surface mention is missing from the training inputs; the Wikidata ID is fed in instead, which is meaningless to the model.
- The entity context is badly jumbled: the predicates and objects are not in order at all.
- The padding token is wrong for RoBERTa: the sequence is padded with `<s>` (id 0), but RoBERTa's pad token is `<pad>` (id 1).
Below is a sample training input:
```python
['<s>',
 'ĠQ', '5', '34', '153',
 '</s>',
 'Ġfantasy', 'Ġnovelist', 'ĠDavid', 'ĠGem', 'm', 'ell', '.',
 'ĠAchilles', 'Ġis', 'Ġfeatured', 'Ġheavily', 'Ġin', 'Ġthe', 'Ġnovel',
 'ĠThe', 'ĠFire', 'brand', 'Ġby', 'ĠMarion', 'ĠZimmer', 'ĠBradley', '.',
 'ĠThe', 'Ġcomic', 'Ġbook', 'Ġhero', 'ĠCaptain', 'ĠMarvel', 'Ġis',
 'Ġendowed', 'Ġwith', 'Ġthe', 'Ġcourage', 'Ġof', 'ĠAchilles', ',', 'Ġas', 'Ġwell',
 '</s>',
 'Ġcountry', 'Ġof', 'Ġcitizenship', 'Ġinspired', 'Ġby', 'ĠZach', 'ary', 'ĠLevi',
 'ĠUnited', 'ĠStates', 'Ġof', 'ĠAmerica', 'Ġperformer', 'ĠSuperman',
 '</s>', '<s>', '<s>',
 ...]
```
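For reference, here is a minimal sketch of how I would expect the input sequence to be assembled, based on my reading of the paper: surface mention first, then the sentence context, then the KG context as ordered (predicate, object) pairs, padded with RoBERTa's `<pad>` token rather than extra `<s>` tokens. The function name and signature are hypothetical, not from this repo.

```python
PAD_TOKEN = "<pad>"  # RoBERTa's pad token (id 1), unlike BERT's [PAD] (id 0)

def build_input(mention_tokens, sentence_tokens, kg_triples, max_len=32):
    """Assemble <s> mention </s> sentence </s> pred obj pred obj ... </s>,
    keeping each predicate next to its object, then pad with <pad>.
    Hypothetical helper: illustrates the expected layout, not the repo's code."""
    kg_tokens = []
    for predicate, obj in kg_triples:  # preserve predicate/object pairing
        kg_tokens += predicate + obj
    seq = (["<s>"] + mention_tokens + ["</s>"]
           + sentence_tokens + ["</s>"]
           + kg_tokens + ["</s>"])
    # pad to a fixed length with <pad>, not with more <s> tokens
    seq += [PAD_TOKEN] * (max_len - len(seq))
    return seq[:max_len]

seq = build_input(
    ["Achilles"],
    ["fantasy", "novelist", "David", "Gemmell"],
    [(["country", "of", "citizenship"], ["Greece"])],
)
```

Here the mention string itself goes into the sequence, not its Wikidata ID, and each KG predicate stays adjacent to its object.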