REST API parser does not find a file if there is an accent in its name
Hi
I called the parser API with this URL-encoded path: http://192.168.0.18:9090/services/rest/parser?path=%2Fusr%2Fshare%2FDocuments%2F2016%2FPr%C3%AAtAltea.pdf
My file has an accent in its name. I receive this error from the parser: java.io.FileNotFoundException: /usr/share/Documents/2016/PrêtAltea.pdf (No such file or directory) - /usr/share/Documents/2016/PrêtAltea.pdf (No such file or directory) - com.jaeksoft.searchlib.webservice.parser.ParserImpl.putMagic(ParserImpl.java:203)
If I remove the accented character from the file name, it works. Is there an option to enable, or is this a bug?
Thanks
Marc
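For anyone reproducing this, a minimal sketch (not part of the parser code; the class name DecodePathCheck is made up for illustration) that checks the request itself is encoded correctly. It decodes the path parameter from the URL above as UTF-8, so %C3%AA should come back as a single "ê". If it does, the URL is probably fine and the problem is more likely on the file system side of the JVM.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class DecodePathCheck {
    // Decodes a percent-encoded query value, interpreting the bytes as UTF-8.
    static String decode(String encoded) throws UnsupportedEncodingException {
        return URLDecoder.decode(encoded, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        // Value of the "path" parameter from the request in this thread.
        String encoded = "%2Fusr%2Fshare%2FDocuments%2F2016%2FPr%C3%AAtAltea.pdf";
        String decoded = decode(encoded);
        // Print the code point of the accented character so the check
        // does not depend on the console encoding.
        int accentIndex = decoded.indexOf('\u00ea');
        System.out.println("length = " + decoded.length());
        System.out.println("index of U+00EA = " + accentIndex);
    }
}
```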
Hi everybody,
Answering my own question: the only thing I had to do was select the right language in the start.sh script.
Instead of en_US.UTF_8, I put fr_FR.UTF_8, so that accented characters exist in the charset.
Maybe there is a more "reliable" solution than hardcoding a language, but that worked for me.
Thanks
Marc
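A small diagnostic sketch that may help confirm the workaround above, assuming an OpenJDK or Oracle JVM. It prints the charsets the JVM picked up from the locale. Note that sun.jnu.encoding is an implementation-specific property, not a documented API, and the class name JvmEncodingCheck is hypothetical. One possible explanation (an assumption, not verified here) is that when the configured locale is not available on the machine, the JVM falls back to an ASCII-only charset, which cannot represent "ê" in a file name.

```java
import java.nio.charset.Charset;

class JvmEncodingCheck {
    // True if the given charset can represent every character of the name.
    static boolean canEncode(String charsetName, String fileName) {
        return Charset.forName(charsetName).newEncoder().canEncode(fileName);
    }

    public static void main(String[] args) {
        // Charsets derived from the locale the JVM was started with.
        System.out.println("default charset  = " + Charset.defaultCharset());
        System.out.println("file.encoding    = " + System.getProperty("file.encoding"));
        // Implementation-specific; may be null on other JVMs.
        System.out.println("sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"));

        // File name from this thread, with the accent written as an escape.
        String name = "Pr\u00eatAltea.pdf";
        System.out.println("US-ASCII can encode name: " + canEncode("US-ASCII", name));
        System.out.println("UTF-8 can encode name:    " + canEncode("UTF-8", name));
    }
}
```

Running it once with each LANG value, in the same environment that start.sh sets up, should show whether the two settings really give the JVM different charsets.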
Hi Marc,
First, thanks for the workaround.
I've read some interesting Q&A threads on this:
http://stackoverflow.com/questions/3072376/how-can-i-open-files-containing-accents-in-java
http://jonisalonen.com/2012/java-and-file-names-with-invalid-characters/
http://stackoverflow.com/questions/12109520/java-cant-see-file-on-file-system-that-contains-illegal-characters
I think one interesting suggestion would be to move to java.nio. I am currently investigating this issue.
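To illustrate the java.nio idea, here is a rough sketch of opening a file through java.nio.file instead of java.io.File. This is not the ParserImpl code, and the class and method names (NioOpenSketch, countBytes) are made up. I have not verified that it fixes the accent problem on its own: if the JVM's file name charset cannot represent a character, Paths.get may throw InvalidPathException instead, which at least gives a clearer error than "No such file or directory".

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;

class NioOpenSketch {
    // Reads the file through java.nio.file and returns the number of bytes read.
    static long countBytes(String pathString) throws IOException {
        Path path;
        try {
            path = Paths.get(pathString);
        } catch (InvalidPathException e) {
            // Thrown when the string cannot be converted to a platform path.
            throw new IOException("Path not representable: " + e.getMessage(), e);
        }
        long total = 0;
        byte[] buffer = new byte[8192];
        // Throws NoSuchFileException (an IOException) if the file is missing.
        try (InputStream in = Files.newInputStream(path)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java NioOpenSketch <path>");
            return;
        }
        System.out.println(countBytes(args[0]) + " bytes read");
    }
}
```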