logchar is defined as char in Windows environments
Although char is not UTF-8 in Windows environments, building with vcpkg sets LOG4CXX_CHAR:STRING=utf-8 and defines logchar as char. In Japanese Windows environments, char strings are encoded in Shift_JIS. Despite the UTF-8 setting, log messages are output as-is without any conversion, so Japanese messages are logged correctly in Shift_JIS, but characters that exist only in Unicode cannot be logged.
We think LOG4CXX_CHAR:STRING should be set to wchar_t and logchar should be defined as wchar_t in Windows environments.
version: 1.4.0 vcpkg tag: 2025.04.09
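To illustrate the mismatch described above, here is a minimal sketch of a UTF-8 well-formedness check applied to the Shift_JIS bytes of "日本". The helper `is_valid_utf8` is hypothetical and not part of Log4cxx; it only shows that text in the system code page is generally not valid UTF-8, even though logchar is declared as UTF-8. Note that such a check is a heuristic: a Shift_JIS byte sequence can occasionally happen to be well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Log4cxx API): returns true if `s` is
// well-formed UTF-8 (no stray continuation bytes, no truncated or
// overlong sequences, no surrogates, nothing above U+10FFFF).
bool is_valid_utf8(const std::string& s)
{
    // Smallest code point that legitimately needs 0, 1, 2 or 3 extra bytes.
    static const unsigned long min_cp[] = { 0x0, 0x80, 0x800, 0x10000 };

    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;
        unsigned long cp = 0;

        if (lead < 0x80)                { extra = 0; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        else return false; // continuation byte or invalid lead byte

        if (extra > s.size() - i - 1) return false; // truncated sequence

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false; // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < min_cp[extra]) return false;               // overlong form
        if (cp > 0x10FFFF) return false;                    // out of range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;     // surrogate

        i += extra + 1;
    }
    return true;
}
```

For example, `is_valid_utf8("\x93\xFA\x96\x7B")` (the Shift_JIS bytes of "日本") is rejected because 0x93 cannot start a UTF-8 sequence, while `is_valid_utf8("\xE6\x97\xA5\xE6\x9C\xAC")` (the same text in UTF-8) is accepted.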
It looks like this changed here: https://github.com/apache/logging-log4cxx/commit/45077480eb6af9bb222c554cad7e2c02d1951fc6
You should be able to work around this by setting LOG4CXX_CHAR=wchar_t when configuring, although I'm not sure how to configure that with vcpkg at the moment.
It seems that the way to fix this is for you to use an overlay port in vcpkg, as we should not use features to implement alternatives. There are some directions on installing locally modified dependencies where you can set the LOG4CXX_CHAR option in the vcpkg port file.
@swebb2066 I don't think there's much for us to do here, as this seems rather specific to Japanese environments. From the Wikipedia page it sounds like Shift_JIS is neither UTF-8 nor UTF-16, but a different variable-width (one or two bytes per character) encoding entirely. I think the best thing for us to do here would be to provide an example overlay file for vcpkg to let users set their encoding instead of trying to figure it out on our own.
> but Unicode-specific characters cannot be logged
@WorldRobertProject please provide a code snippet that shows how you would like to log Unicode-specific characters.
Are you using Qt?
Can you advise whether I should apply this patch to Log4cxx:
diff --git a/src/main/include/log4cxx-qt/transcoder.h b/src/main/include/log4cxx-qt/transcoder.h
index 0b1aaab4..43ad6447 100644
--- a/src/main/include/log4cxx-qt/transcoder.h
+++ b/src/main/include/log4cxx-qt/transcoder.h
@@ -31,7 +31,7 @@
@param src The QString variable.
*/
#define LOG4CXX_DECODE_QSTRING(var, src) \
- LOG4CXX_NS::LogString var = (src).toStdString()
+ LOG4CXX_NS::LogString var = (src).toUtf8().constData()
/** Create a QString equivalent of \c src.
@@ -43,7 +43,7 @@
@param src The log4cxx::LogString variable.
*/
#define LOG4CXX_ENCODE_QSTRING(var, src) \
- QString var = QString::fromStdString(src)
+ QString var = QString::fromUtf8(src.c_str())
#endif // LOG4CXX_LOGCHAR_IS_UTF8
#if LOG4CXX_LOGCHAR_IS_WCHAR
@rm5248 For now, this issue has been resolved by the overlay port. Thank you so much.
I realized that there is still an issue caused by character encoding. The character type for file names in LocationInfo is char, so file names that contain non-ASCII characters are not logged correctly. We think the character type for file names should be logchar or another typedef. That said, file names with non-ASCII characters are almost never used, so it will not be an issue as long as only short file names are logged. (Directory names may contain non-ASCII characters.)
The character type for function names being char is probably not an issue, but the current C++ standard allows function names to include non-ASCII characters (although this is not common). It might be better to change it to logchar or another typedef as well, but the severity is probably low.
@swebb2066 Sorry, we don't use Qt.
Because conversion from Shift_JIS to UTF-8 typically goes through UTF-16, UTF-16 is more efficient than UTF-8.
We need to integrate with C# in our custom appender, and C#'s char is UTF-16. This is another reason why we want to use wchar_t.
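As a rough illustration of the extra hop mentioned above, the following sketch shows only the second step of that path (UTF-16 to UTF-8), which a wchar_t/UTF-16 logchar avoids. The function `utf16_to_utf8` is a hypothetical helper, not a Log4cxx API. The first step (Shift_JIS to UTF-16) needs a code page table, such as the one behind the Windows `MultiByteToWideChar` function, and is not shown.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not a Log4cxx API): encode UTF-16 code units
// as UTF-8 bytes. Unpaired surrogates are replaced with U+FFFD.
std::string utf16_to_utf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                const std::uint32_t low = in[i + 1];
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD; // unpaired high surrogate
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD; // unpaired low surrogate
        }

        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

With a UTF-16 logchar, text that is already UTF-16 (for example strings handed to or from C#) can be passed through without this re-encoding step.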
Here's how we use an overlay port:
- Copy vcpkg\ports\log4cxx to another place
- Add an option -DLOG4CXX_CHAR=wchar_t to vcpkg_cmake_configure in the copied portfile.cmake
- Build as vcpkg install log4cxx --overlay-ports=<copied folder path>