Change source encoding to UTF-8

Open mkoncek opened this issue 2 years ago • 6 comments

We have a project which bootstraps Maven. It manually calls javac and we encountered a problem with source encoding. I would like to unify source encodings to UTF-8.

mkoncek avatar Jun 05 '23 10:06 mkoncek

Hello @mkoncek How can this goal be enforced such that this build fails without the change?

garydgregory avatar Jun 05 '23 11:06 garydgregory

maven-compiler-plugin is declared in commons-parent, which also uses the encoding field. I rebased the PR so that the build reports an error the same way javac does in our case. I don't know why the build doesn't fail, but there is an [ERROR] in the log when the encoding is set to UTF-8 and the current sources are used.

commons-parent uses ISO-8859-1 encoding by default.

@sebbASF That would work too, but I believe it is time we could afford the luxury of using non-ASCII characters in sources.

mkoncek avatar Jun 05 '23 11:06 mkoncek

I can confirm that changing the encoding causes the compile to report an ERROR, but the build succeeds:

    $ mvn clean compile -Dcommons.encoding=UTF8
    ...
    [INFO] Compiling 399 source files with javac [debug release 8] to target/classes
    [ERROR] /commons/compress/src/main/java/org/apache/commons/compress/archivers/tar/TarArchiveOutputStream.java:[300,32] unmappable character (0xF6) for encoding UTF-8
    ...
    [INFO] BUILD SUCCESS
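For readers unfamiliar with the message, here is a minimal sketch of why a lone 0xF6 byte trips a UTF-8 decoder. It assumes the byte is an 'ö' saved in ISO-8859-1, which the thread suggests but does not confirm. The class and method names are hypothetical and are not part of commons-compress.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class UnmappableByteDemo {

    /** Returns true if the bytes decode as well-formed UTF-8 (hypothetical helper). */
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            // REPORT makes the decoder throw instead of silently substituting U+FFFD.
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: the offending byte is U+00F6 stored in an ISO-8859-1 file.
        byte[] latin1 = {(byte) 0xF6};
        // The same character stored in a UTF-8 file takes two bytes.
        byte[] utf8 = {(byte) 0xC3, (byte) 0xB6};

        // ISO-8859-1 maps every byte to a character, so 0xF6 decodes to U+00F6.
        String decoded = new String(latin1, StandardCharsets.ISO_8859_1);
        System.out.println("ISO-8859-1 yields U+00F6: " + decoded.equals("\u00f6"));

        // 0xF6 is not a legal UTF-8 lead byte, so strict decoding rejects it.
        System.out.println("0xF6 valid as UTF-8: " + isValidUtf8(latin1));
        System.out.println("0xC3 0xB6 valid as UTF-8: " + isValidUtf8(utf8));
    }
}
```

A strict decoder like this could also be run over the source tree to find which files need converting before the encoding is switched.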

I've tried experimenting with -Dmaven.compiler.failOnWarning=true (and failOnError), but Maven does not fail the build.

However, adding -Dcommons.compiler.fork=true does cause the build to fail. Possible bug in Maven?

sebbASF avatar Jun 05 '23 14:06 sebbASF

See https://issues.apache.org/jira/browse/MCOMPILER-491

sebbASF avatar Jun 06 '23 08:06 sebbASF

Unfortunately, when using fork=true, some informational messages are not shown, see: https://issues.apache.org/jira/browse/MCOMPILER-537

sebbASF avatar Jun 06 '23 10:06 sebbASF

> How can this goal be enforced such that this build fails without the change?

Maybe by adding a UTF-8 encoded test asset, plus a test that reads the file in binary mode and compares to the expected bytes?
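A rough sketch of what such a check might look like, assuming plain `java.nio` and no particular test framework. The class name, method names, and the temporary file standing in for a checked-in asset are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class Utf8AssetCheck {

    /** Reads the asset as raw bytes, with no charset decoding, and compares them. */
    public static boolean assetMatches(Path asset, byte[] expected) throws IOException {
        return Arrays.equals(Files.readAllBytes(asset), expected);
    }

    /**
     * Writes assetBytes to a temporary file (a stand-in for a checked-in test
     * asset) and compares the file content with expected.
     */
    public static boolean writeAndCompare(byte[] assetBytes, byte[] expected) {
        try {
            Path asset = Files.createTempFile("utf8-asset", ".txt");
            try {
                Files.write(asset, assetBytes);
                return assetMatches(asset, expected);
            } finally {
                Files.deleteIfExists(asset);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // UTF-8 bytes of U+00F6, as they would appear in the asset file.
        byte[] assetBytes = {(byte) 0xC3, (byte) 0xB6};
        // A real test would write the character itself in the source rather than
        // an escape, so that compiling with the wrong -encoding changes the literal
        // and the comparison fails. The escape keeps this sketch encoding-neutral.
        byte[] fromLiteral = "\u00f6".getBytes(StandardCharsets.UTF_8);
        System.out.println("asset matches literal: " + writeAndCompare(assetBytes, fromLiteral));
    }
}
```

Comparing raw bytes avoids depending on the platform default charset when reading the asset, so only the compiler's source encoding can make the check fail.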

sschuberth avatar Jan 11 '24 07:01 sschuberth