wn

Move lexicalized column out of synsets and senses tables

Open goodmami opened this issue 9 months ago • 0 comments

The lexicalized column is 1 when a synset or sense is lexicalized and 0 when it is not, but so far there are no instances where anything is marked 0. This means the column costs at least 1 byte per synset and per sense in the database. As a quick estimate, using a database containing all of OMW 1.4 and a single version each of OdeNet and OEWN, there are 1,206,872 synsets and 2,268,820 senses, so the column takes at least 3,475,692 bytes (3.3 MiB). If, instead, there were separate tables that only marked the unlexicalized synsets and senses, this size would be effectively 0.
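To make the proposal concrete, here is a minimal sketch using Python's built-in sqlite3 module. The schema is a simplified stand-in, not the actual wn schema: the table name `unlexicalized_synsets`, the columns `pk` and `synset_pk`, and the helper `is_lexicalized` are all hypothetical names chosen for illustration. Senses would presumably get an analogous table. The first lines just reproduce the size estimate from the counts quoted above.

```python
import sqlite3

# Size estimate from the counts quoted in this issue (1 byte per row).
SYNSETS = 1_206_872
SENSES = 2_268_820
column_bytes = SYNSETS + SENSES
column_mib = column_bytes / 2**20

# Hypothetical, simplified schema: no lexicalized column on synsets;
# only the unlexicalized synsets get a row in a separate table.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE synsets (
        pk INTEGER PRIMARY KEY,
        id TEXT NOT NULL
    );
    CREATE TABLE unlexicalized_synsets (
        synset_pk INTEGER PRIMARY KEY REFERENCES synsets (pk)
    );
    """
)
conn.executemany(
    "INSERT INTO synsets (pk, id) VALUES (?, ?)",
    [(1, "demo-0001-n"), (2, "demo-0002-n"), (3, "demo-0003-n")],
)
# Mark only synset 2 as unlexicalized (made-up data for the demo).
conn.execute("INSERT INTO unlexicalized_synsets (synset_pk) VALUES (2)")


def is_lexicalized(conn: sqlite3.Connection, synset_pk: int) -> bool:
    """Return True unless the synset appears in the unlexicalized table."""
    row = conn.execute(
        "SELECT NOT EXISTS ("
        "SELECT 1 FROM unlexicalized_synsets WHERE synset_pk = ?)",
        (synset_pk,),
    ).fetchone()
    return bool(row[0])


lexicalized_flags = [is_lexicalized(conn, pk) for pk in (1, 2, 3)]
```

With this layout, a database in which every synset is lexicalized stores no rows at all in the extra table, which is the "effectively 0" case described above. The trade-off is that checking lexicalization takes a lookup in a second table rather than a column read.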

goodmami, Apr 04 '25 23:04