
Several incorrectly formatted MorphoLexSegm entries

Open unendin opened this issue 6 years ago • 2 comments

Thank you for the high-quality data. A total of 5 MorphoLexSegm entries have overlapping rather than nested bracket types -- in particular, {( }) rather than {( )}. This came up when attempting to parse the field. Assuming there's no morphological rationale for the overlaps, nesting would be more consistent and readily parsed:
{(back}){(board)} > {(back)}{(board)}
{(back}){(bone)} > {(back)}{(bone)}
{(back}){(break)} > {(back)}{(break)}
{(back}){(door)} > {(back)}{(door)}
{(back}){(drop)} > {(back)}{(drop)}
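
For reference, here is a minimal sketch of how the overlaps can be detected and repaired. It assumes that only `()` and `{}` need to pair up and that every other character can be ignored. The names `is_properly_nested` and `fix_overlap` are hypothetical helpers, not part of MorphoLex.

```python
# Map each closing bracket to the opening bracket it must match.
PAIRS = {")": "(", "}": "{"}
OPENERS = set(PAIRS.values())


def is_properly_nested(segm: str) -> bool:
    """Return True if every ( ) and { } in segm is nested, not overlapping."""
    stack = []
    for ch in segm:
        if ch in OPENERS:
            stack.append(ch)
        elif ch in PAIRS:
            # A closer must match the most recently opened bracket.
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    return not stack


def fix_overlap(segm: str) -> str:
    """Swap the overlapping '})' closer into the nested ')}' form."""
    return segm.replace("})", ")}")


entries = [
    "{(back}){(board)}",
    "{(back}){(bone)}",
    "{(back}){(break)}",
    "{(back}){(door)}",
    "{(back}){(drop)}",
]
for entry in entries:
    if not is_properly_nested(entry):
        print(entry, ">", fix_overlap(entry))
```

The plain string replacement is enough for these five entries because the only defect is the `})` closer; a stack-based check is used for detection so that other kinds of mismatch would also be flagged.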

unendin, Oct 31 '18 19:10

Thanks a lot for this report!

I'll try to fix that as soon as I get the time, and upload a new version of the database.

hugomailhot, Oct 31 '18 20:10

I'm fixing the issues (since the dataset is otherwise fantastic!), but this one resulted in a "(back})" root on the "All roots" tab, and thus an error in the stats for these 5 words. How would you like to handle that? I suspect similar issues may arise as I fix other issues.
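
To illustrate how the stray root can arise, here is a rough sketch that assumes roots are whatever sits between `(` and `)`, which is what the "(back})" root suggests. The `extract_roots` helper and its regex are hypothetical; I don't know how the "All roots" tab was actually generated.

```python
import re

# Assumption: a root is the text between a '(' and the next ')'.
ROOT_RE = re.compile(r"\(([^()]*)\)")


def extract_roots(segm: str) -> list[str]:
    """Return the parenthesized spans of a MorphoLexSegm string."""
    return ROOT_RE.findall(segm)


print(extract_roots("{(back}){(board)}"))  # overlapping form
print(extract_roots("{(back)}{(board)}"))  # nested form
```

With the overlapping form, the closing brace ends up inside the captured root, so any per-root counts would treat it as a different root from plain "back".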

jonthegeek, Dec 02 '20 21:12