textidote
n-grams analysis (using `--languagemodel`) gives `java.util.ServiceConfigurationError`
When running the following command (file read from stdin):
textidote --languagemodel /path/containing/fr/directory --html --dict .ltignore --check fr
I end up with the following error:
Using N-grams from /home/simon/Téléchargements
TeXtidote v0.7 - A linter for LaTeX documents and others
(C) 2018-2019 Sylvain Hallé - All rights reserved
Exception in thread "main" java.util.ServiceConfigurationError: Cannot instantiate SPI class: org.apache.lucene.codecs.lucene50.Lucene50Codec
at org.apache.lucene.util.NamedSPILoader.reload(NamedSPILoader.java:82)
at org.apache.lucene.util.NamedSPILoader.<init>(NamedSPILoader.java:51)
at org.apache.lucene.util.NamedSPILoader.<init>(NamedSPILoader.java:38)
at org.apache.lucene.codecs.Codec$Holder.<clinit>(Codec.java:47)
at org.apache.lucene.codecs.Codec.forName(Codec.java:113)
at org.apache.lucene.index.SegmentInfos.readCodec(SegmentInfos.java:469)
at org.apache.lucene.index.SegmentInfos.readCommit(SegmentInfos.java:361)
at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:53)
at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:50)
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:731)
at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:50)
at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:63)
at org.languagetool.languagemodel.LuceneSingleIndexLanguageModel$LuceneSearcher.<init>(LuceneSingleIndexLanguageModel.java:242)
at org.languagetool.languagemodel.LuceneSingleIndexLanguageModel$LuceneSearcher.<init>(LuceneSingleIndexLanguageModel.java:230)
at org.languagetool.languagemodel.LuceneSingleIndexLanguageModel.getCachedLuceneSearcher(LuceneSingleIndexLanguageModel.java:183)
at org.languagetool.languagemodel.LuceneSingleIndexLanguageModel.addIndex(LuceneSingleIndexLanguageModel.java:119)
at org.languagetool.languagemodel.LuceneSingleIndexLanguageModel.<init>(LuceneSingleIndexLanguageModel.java:93)
at org.languagetool.languagemodel.LuceneLanguageModel.<init>(LuceneLanguageModel.java:65)
at org.languagetool.language.French.getLanguageModel(French.java:132)
at org.languagetool.JLanguageTool.activateLanguageModelRules(JLanguageTool.java:341)
at ca.uqac.lif.textidote.rules.CheckLanguage.activateLanguageModelRules(CheckLanguage.java:241)
at ca.uqac.lif.textidote.Main.mainLoop(Main.java:546)
at ca.uqac.lif.textidote.Main.mainLoop(Main.java:124)
at ca.uqac.lif.textidote.Main.main(Main.java:110)
Caused by: java.lang.IllegalArgumentException: An SPI class of type org.apache.lucene.codecs.PostingsFormat with name 'Lucene50' does not exist. You need to add the corresponding JAR file supporting this SPI to your classpath. The current classpath supports the following names: [Lucene40, Lucene41]
at org.apache.lucene.util.NamedSPILoader.lookup(NamedSPILoader.java:114)
at org.apache.lucene.codecs.PostingsFormat.forName(PostingsFormat.java:112)
at org.apache.lucene.codecs.lucene50.Lucene50Codec.<init>(Lucene50Codec.java:155)
at org.apache.lucene.codecs.lucene50.Lucene50Codec.<init>(Lucene50Codec.java:75)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
at java.base/java.lang.Class.newInstance(Class.java:584)
at org.apache.lucene.util.NamedSPILoader.reload(NamedSPILoader.java:72)
... 23 more
The n-grams files were downloaded from the following URL:
https://languagetool.org/download/ngram-data/
as instructed by the LanguageTool documentation page.
The service registration files inside the JAR file need to be edited; a solution is described here: https://anwaarlabs.wordpress.com/2017/02/25/lucene-an-spi-class-of-type-org-apache-lucene-codecs-codec-with-name-does-not-exist/
I can patch the existing release, but I'll have to think of a way to automate this for future releases.
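For anyone who wants to check whether a given JAR is affected, here is a minimal Python sketch. It assumes the JAR is an ordinary zip archive and that Lucene's providers are registered in files named `META-INF/services/org.apache.lucene.codecs.*`; the function name `list_lucene_services` is made up for illustration and is not part of TeXtidote or LanguageTool.

```python
import zipfile

# Assumed location of Lucene's service registration files inside the JAR.
SERVICE_PREFIX = "META-INF/services/org.apache.lucene.codecs."


def list_lucene_services(jar_path):
    """Return {service file name: [registered provider class names]}."""
    registered = {}
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if not name.startswith(SERVICE_PREFIX):
                continue
            providers = []
            for line in jar.read(name).decode("utf-8").splitlines():
                # Service files allow '#' comments and blank lines; skip both.
                line = line.split("#", 1)[0].strip()
                if line:
                    providers.append(line)
            registered[name] = providers
    return registered
```

Calling `list_lucene_services("textidote.jar")` and looking at the `PostingsFormat` entry should show whether any `lucene50` provider is listed. If none is, the JAR is probably hitting the error reported above.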
Release v0.7.1 should fix the problem. Feel free to reopen if the problem persists.
I just tested 0.7.1. It worked fine! Thanks for the quick reaction!
Thank you for the great program. Unfortunately I get exactly this error when I want to use the latest version (0.8.3) with current n-gram data. Is there a solution for this?
The new LanguageTool jar seems to have the exact same issue as the previous one. I'll reopen and try to fix it again.
I’m getting this exact error, but can’t fix it even after following the instructions at https://anwaarlabs.wordpress.com/2017/02/25/lucene-an-spi-class-of-type-org-apache-lucene-codecs-codec-with-name-does-not-exist/.
Is there a workaround simple enough for non-Java users?
I "patched" the 0.8.3 version manually and it works for me. Use at your own risk. I edited three files in META-INF using vim.
https://transfer.sh/64Oy0T/textidote_patched.jar
@bong0 Thanks for your contribution. I would like to integrate your changes in the pipeline that creates the LanguageTool fat JAR. Could you please tell me which files you modified and what changes you made to them?
Sure, hope this helps. I added the lines listed in the diffs. Let me know if anything is unclear :)
[changed] META-INF/services/org.apache.lucene.codecs.PostingsFormat
❯ diff t1/**/META-INF/services/org.apache.lucene.codecs.PostingsFormat t2/**/META-INF/services/org.apache.lucene.codecs.PostingsFormat
17a18
> org.apache.lucene.codecs.lucene50.Lucene50PostingsFormat
[changed] META-INF/services/org.apache.lucene.codecs.DocValuesFormat
❯ diff t1/**/META-INF/services/org.apache.lucene.codecs.DocValuesFormat t2/**/META-INF/services/org.apache.lucene.codecs.DocValuesFormat
20a21
> org.apache.lucene.codecs.lucene54.Lucene54DocValuesFormat
[changed] META-INF/services/org.apache.lucene.codecs.Codec
❯ diff t1/**/META-INF/services/org.apache.lucene.codecs.Codec t2/**/META-INF/services/org.apache.lucene.codecs.Codec
24a25
> org.apache.lucene.codecs.lucene54.Lucene54Codec
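The three diffs above could be automated along these lines. This is only a sketch, not the actual build pipeline: the function name `patch_jar` and the `MISSING_PROVIDERS` table are made up for illustration, the provider lines are copied from the diffs, and it assumes the three service files already exist in the JAR (files that are absent are not created).

```python
import zipfile

# Provider lines copied from the diffs above, keyed by service file.
MISSING_PROVIDERS = {
    "META-INF/services/org.apache.lucene.codecs.PostingsFormat":
        ["org.apache.lucene.codecs.lucene50.Lucene50PostingsFormat"],
    "META-INF/services/org.apache.lucene.codecs.DocValuesFormat":
        ["org.apache.lucene.codecs.lucene54.Lucene54DocValuesFormat"],
    "META-INF/services/org.apache.lucene.codecs.Codec":
        ["org.apache.lucene.codecs.lucene54.Lucene54Codec"],
}


def patch_jar(src_path, dst_path, additions=MISSING_PROVIDERS):
    """Copy the JAR at src_path to dst_path, appending any provider
    lines from `additions` that a service file does not list yet."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info)
            if info.filename in additions:
                lines = data.decode("utf-8").splitlines()
                existing = {line.strip() for line in lines}
                for provider in additions[info.filename]:
                    if provider not in existing:
                        lines.append(provider)
                data = ("\n".join(lines) + "\n").encode("utf-8")
            # Reuse the original entry metadata (name, date, compression).
            dst.writestr(info, data)
```

For example, `patch_jar("textidote.jar", "textidote_patched.jar")` writes a patched copy and leaves the original untouched. Running it on an already patched JAR should add nothing, since lines that are already present are skipped.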