Accented characters issue when using java.nio.file.Path
Describe the issue
Path.of(...) throws an InvalidPathException when the path contains accented characters.
Steps to reproduce the issue
- Using this test sample (GraalVMAccentedTest.java):
import java.nio.file.Path;
public final class GraalVMAccentedTest {
public static void main(String[] args) {
var file = Path.of("à éïôù.txt").toFile();
System.out.println("Could instantiate file " + file.getName());
}
}
- Using eclipse-temurin java 17 (locally):
javac GraalVMAccentedTest.java
java GraalVMAccentedTest
# Outputs: "Could instantiate file à éïôù.txt"
- Run a GraalVM image:
docker run --name graalvm-accented-test -it "ghcr.io/graalvm/graalvm-ce:ol9-java17" /bin/bash
- Copy the file there (from another terminal):
docker cp GraalVMAccentedTest.java graalvm-accented-test:/app/GraalVMAccentedTest.java
- Compile and run (from the 3rd step terminal):
# Fails without -encoding utf8
javac -encoding utf8 GraalVMAccentedTest.java
java GraalVMAccentedTest
- Outputs:
Exception in thread "main" java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: ?????.txt
at java.base/sun.nio.fs.UnixPath.encode(UnixPath.java:121)
at java.base/sun.nio.fs.UnixPath.<init>(UnixPath.java:68)
at java.base/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:279)
at java.base/java.nio.file.Path.of(Path.java:147)
at GraalVMAccentedTest.main(GraalVMAccentedTest.java:5)
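The stack trace fails in UnixPath.encode, the step that turns the path string into bytes. As far as I understand, JDK 17 uses the charset named by the sun.jnu.encoding system property for that conversion, and the JVM derives that property from the host locale at startup. The sketch below is not part of the original report: the class name EncodingDiagnostics and the helper canEncode are made up for illustration. It prints the encoding-related properties and checks whether each charset can represent the file name from the test sample.

```java
import java.nio.charset.Charset;

final class EncodingDiagnostics {

    // Returns true if the named charset can represent every character of text.
    static boolean canEncode(String charsetName, String text) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // Same file name as in the report, written with Unicode escapes so the
        // source compiles the same way whatever encoding javac assumes.
        String name = "\u00e0 \u00e9\u00ef\u00f4\u00f9.txt";

        // Assumption: sun.jnu.encoding is the property consulted for path names;
        // the other two are printed only for comparison.
        String[] keys = {"sun.jnu.encoding", "file.encoding", "native.encoding"};
        for (String key : keys) {
            String value = System.getProperty(key);
            if (value == null) {
                System.out.println(key + " is not set");
                continue;
            }
            System.out.println(key + " = " + value
                    + ", can encode the name: " + canEncode(value, name));
        }
        System.out.println("Default charset: " + Charset.defaultCharset());
    }
}
```

If sun.jnu.encoding reports an ASCII-only charset inside the container, that would be consistent with the "unmappable characters" message above.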
Describe GraalVM and your environment:
- GraalVM version: ol9-java17-22.3.3-b1
- JDK major version: 17
- OS: OL9
- Architecture: AMD64
More details
I also tried other base images (ol7, ol8). It turns out ol7 works, and it does not even need "-encoding utf-8" when compiling:
- ghcr.io/graalvm/graalvm-ce:ol9-java17 and container-registry.oracle.com/graalvm/jdk:17-ol9 are failing
- ghcr.io/graalvm/graalvm-ce:ol8-java17 and container-registry.oracle.com/graalvm/jdk:17-ol8 are failing
- ghcr.io/graalvm/graalvm-ce:ol7-java17 and container-registry.oracle.com/graalvm/jdk:17-ol7 are OK!
docker run --name graalvm-accented-test -it "container-registry.oracle.com/graalvm/jdk:17-ol7" /bin/bash
# ...
bash-4.2# javac GraalVMAccentedTest.java
bash-4.2# java GraalVMAccentedTest
Could instantiate file à éïôù.txt
Thanks for reporting this. I will try to reproduce it on my side.
@fernando-valdez Thanks :) I just updated the description (More details section) after trying different images. It turns out it works with ol7-based images.
@fernando-valdez Speaking with CentOS/RHEL in mind: regardless of which locales you declare for the native image, the host system actually has to have the langpacks installed. I hit this issue back in the day and wrote this doc:
/**
* This test in Native won't work on a barebone system,
* just with C.UTF-8 default fallback locale.
*
* For example, this package satisfies the dependency on a RHEL 9 type of OS:
* glibc-all-langpacks
*
*/
So I'd suggest checking the glibc langpacks on those OL base images.
Thanks @Karm, I can confirm that running:
microdnf --nobest install glibc-all-langpacks
fixes it. "-encoding utf-8" is no longer needed for compilation, and no exception is thrown at run time.
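To confirm the fix from Java after changing an image, a small smoke test along these lines may help. It is only a sketch: the class name PathLocaleCheck and the helper isAcceptedAsPath are hypothetical, and the choice of environment variables to print (LC_ALL, LC_CTYPE, LANG) is my assumption about what is useful to inspect, not something stated in the thread.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Path;

final class PathLocaleCheck {

    // Returns true if the default file system accepts the name as a path.
    static boolean isAcceptedAsPath(String name) {
        try {
            Path.of(name);
            return true;
        } catch (InvalidPathException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // File name from the report, as Unicode escapes to avoid any
        // dependency on the encoding javac assumes for the source file.
        String name = "\u00e0 \u00e9\u00ef\u00f4\u00f9.txt";

        // Print the locale-related environment variables (null if unset).
        for (String key : new String[] {"LC_ALL", "LC_CTYPE", "LANG"}) {
            System.out.println(key + " = " + System.getenv(key));
        }

        boolean ok = isAcceptedAsPath(name);
        System.out.println("Path.of accepts the accented name: " + ok);
        if (!ok) {
            // A non-zero exit code lets a script or image build fail early.
            System.exit(1);
        }
    }
}
```

Running it before and after installing glibc-all-langpacks should show whether the accented name is accepted in that environment.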
Thanks @Manokha for confirming. And thanks @Karm for your recommendation!