
Exception when parsing Mozilla Italian dictionary file

Open · lorenzos opened this issue 2 years ago · 1 comment

Mozilla's Italian dictionary file contains some "comments" at the top (lines starting with /), which cause a parsing error when merging dictionaries:

java.lang.ArrayIndexOutOfBoundsException: Index 0 out of bounds for length 0
	at hunspell.merge.DicReader.readLine(DicReader.java:32)
	at hunspell.merge.FileReader.readFile(FileReader.java:29)
	at hunspell.merge.DictionaryFile.readFiles(DictionaryFile.java:113)
	at hunspell.merge.HunspellMerge.createDictionariesImpl(HunspellMerge.java:339)
	at hunspell.merge.HunspellMerge.access$1500(HunspellMerge.java:25)
	at hunspell.merge.HunspellMerge$9$1.run(HunspellMerge.java:323)
	at java.base/java.lang.Thread.run(Thread.java:829)

I think the issue is that hunspell-merge searches for lines in the form word/FLAGS and expects the word part to always be non-empty. Here are the top lines of the Italian file that causes the exception:

95421
/ "Dizionario italiano" add-on for Mozilla products.
/
/ Forked from: "Estensione linguistica italiana - Italian Writing Aids
/ extension" version 5.1, see README.txt for more details.
/
/ Copyright (C) 2001, 2002 Gianluca Turconi
/ [...]
/
/ You should have received a copy of the GNU General Public
/ License along with the "Estensione linguistica italiana - Italian
/ Writing Aids extension"; if not, see <http://www.gnu.org/licenses/>.
a
ab
abaco/OTqr
Abacuc
abadessa/QTUqrs

I think hunspell-merge should be able to simply ignore these.
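
For illustration, here is a minimal sketch of what ignoring such lines could look like. It is not the actual `DicReader` code, which I have not inspected: `DicLineSketch`, `Entry`, and `parseLine` are hypothetical names. One plausible cause of the exception (an assumption, not confirmed against the source) is a call like `line.split("/")[0]`: in Java, `"/".split("/")` returns a zero-length array, which would match "Index 0 out of bounds for length 0" on the bare `/` lines. The sketch looks for the slash with `indexOf` instead and treats an empty word part as a comment. It assumes the entry count on the first line is handled separately.

```java
import java.util.Optional;

/** Hypothetical helper for illustration; not part of hunspell-merge. */
final class DicLineSketch {

    /** One dictionary entry: a word plus its flag string (empty if none). */
    static final class Entry {
        final String word;
        final String flags;

        Entry(String word, String flags) {
            this.word = word;
            this.flags = flags;
        }
    }

    /**
     * Parses one line of the form "word" or "word/FLAGS".
     * Returns Optional.empty() for blank lines and for lines whose
     * word part is empty, i.e. lines that start with '/'.
     */
    static Optional<Entry> parseLine(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return Optional.empty();
        }
        int slash = trimmed.indexOf('/');
        if (slash == 0) {
            // Empty word part: treat the line as a comment and skip it.
            return Optional.empty();
        }
        if (slash < 0) {
            // No flags present, e.g. "Abacuc".
            return Optional.of(new Entry(trimmed, ""));
        }
        // Split at the first slash only, e.g. "abaco/OTqr".
        return Optional.of(
                new Entry(trimmed.substring(0, slash), trimmed.substring(slash + 1)));
    }
}
```

With this approach a caller would skip any line for which `parseLine` returns an empty `Optional` and keep going, so the header block in the Italian file would be dropped while `abaco/OTqr` still yields the word `abaco` with flags `OTqr`. The sketch deliberately ignores other details of the dictionary format.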

lorenzos, Oct 09 '21 21:10

Thank you for the report! Unfortunately I am unable to help.

This tool was created by other people, and I used it years ago. When Google Code was shut down, I preserved the original repository by migrating it to GitHub.

I am not able to maintain this code, so you’d have to address any issues yourself or find a Java developer to do it.

I’ve added this information to the ReadMe and will now archive this repository to avoid confusion in the future.

arty-name, Oct 13 '21 09:10