
Validation of LMF IDs when adding lexicons

Open goodmami opened this issue 4 years ago • 1 comment

SQL errors can be hard to understand, especially for a user of Wn. To avoid these, some validation of the LMF files should be performed, such as ensuring that all IDs referenced are provided by the document (including as external elements in extension lexicons).

There is already some validation, e.g., of allowed part-of-speech values, relation types, etc., but so far nothing regarding entity linking. Things like cycle detection are probably too expensive to do during add, though.
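To make the ID check described above concrete, here is a minimal sketch. It does not use Wn's actual data structures: `find_dangling_refs`, its parameters, and the toy IDs are all hypothetical, and it assumes the declared IDs and the (source, target) references have already been collected from the document.

```python
from typing import Iterable, List, Set, Tuple


def find_dangling_refs(
    defined_ids: Iterable[str],
    references: Iterable[Tuple[str, str]],
    external_ids: Iterable[str] = (),
) -> List[Tuple[str, str]]:
    """Return (source_id, target_id) pairs whose target is not declared.

    Hypothetical helper, not part of Wn. `external_ids` stands in for
    IDs that an extension lexicon declares as external elements.
    """
    known: Set[str] = set(defined_ids) | set(external_ids)
    return [(src, tgt) for src, tgt in references if tgt not in known]


# Toy data standing in for a parsed lexicon (not real LMF structures).
defined = ["ex-en-s1", "ex-en-s2"]
refs = [("ex-en-s1", "ex-en-s2"), ("ex-en-s2", "ex-en-s3")]

# "ex-en-s3" is referenced but never declared, so it is reported.
dangling = find_dangling_refs(defined, refs)
```

A single pass over the references with a set lookup is cheap, unlike cycle detection, so a check of this kind could plausibly run during add.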

goodmami, Feb 17 '21 08:02

Validation is now handled by wn.validate, so I'm pulling this off the v0.9.0 milestone to avoid holding it up. I think we need to re-evaluate what to do here. One option: catch SQL errors when adding a lexicon and suggest that the user validate it first. For this to be effective, I think we need an easy way to load an LMF lexicon from a file stored in the cache so that it can be passed to wn.validate.validate().
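A rough sketch of the "catch SQL errors and suggest validation" option, using only the standard-library sqlite3 module. The two-table schema, `make_demo_db`, `add_senses`, and `LexiconAddError` are hypothetical stand-ins for illustration, not Wn's actual schema or API.

```python
import sqlite3
from typing import Iterable, Tuple


class LexiconAddError(Exception):
    """Hypothetical user-facing error; not part of Wn."""


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory schema with one foreign-key constraint."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.execute("CREATE TABLE synsets (id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE senses ("
        " id TEXT PRIMARY KEY,"
        " synset_id TEXT NOT NULL REFERENCES synsets (id))"
    )
    return conn


def add_senses(conn: sqlite3.Connection, rows: Iterable[Tuple[str, str]]) -> None:
    """Insert (sense_id, synset_id) rows, translating SQL errors into a hint."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.executemany(
                "INSERT INTO senses (id, synset_id) VALUES (?, ?)", rows
            )
    except sqlite3.Error as exc:
        raise LexiconAddError(
            f"could not add lexicon ({exc}); the file may contain invalid "
            "or dangling IDs, try validating it first"
        ) from exc
```

Chaining with `from exc` keeps the original SQL error available for debugging, while the user sees the suggestion to validate first.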

goodmami, Nov 18 '21 00:11