
dictionary db question

Open rowanlend opened this issue 2 years ago • 1 comments

Hello - saw this initially on HN and was curious about the data.

Apple's Mac comes with a native dictionary that has about 80,000 words. I opened the SQLite .db file in this project to do some quick comparisons and noticed that it contains about 52,000 words.
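For anyone who wants to repeat this kind of comparison, here is a minimal sketch of how the row counts could be checked with Python's built-in `sqlite3` module. The file path and the helper name `count_rows_per_table` are assumptions for illustration; the project's actual table names are not stated in this thread, so the sketch simply counts rows in every table it finds.

```python
import sqlite3


def count_rows_per_table(db_path):
    """Return {table_name: row_count} for every user table in a SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master lists every table in the file; skip SQLite's internal ones.
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            )
            if not row[0].startswith("sqlite_")
        ]
        counts = {}
        for name in names:
            # Quote the identifier so unusual table names do not break the query.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
        return counts
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical path: replace with the .db file shipped in the repo.
    print(count_rows_per_table("dictionary.db"))
```

A row count is only a rough proxy for the number of distinct words, since a table may hold several rows per word (for example, one per definition).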

I'm curious about the discrepancy, since the source data you cite, freeDictionaryAPI, has about 220,000 words. From a quick spot check, I noticed your list doesn't include hyphenated words, which I think is great, but a fairly comprehensive dictionary source would eventually be a great asset in general.

It'd be great to understand how or why you pared down the list of words to what you currently have now.

Either way, thanks for putting this together!

rowanlend avatar Aug 04 '22 02:08 rowanlend

Hi @grepsci, good question.

For a reason I don't yet understand (I've already opened an issue with Expo to investigate it: https://github.com/expo/expo/issues/18479), copying the offline DB to the proper folder on iOS takes a long time. Because of that, I reduced the file size by keeping only the most common words.

So right now, I have two datasets:

  • Android is using a Wikidictionary version with 136,338 words
  • iOS is using a reduced version (the one that is in the repo) with 52,000 words

zehfernandes avatar Aug 04 '22 03:08 zehfernandes