Log4J2: PatternLayout "charset" in documentation but forbidden by xml schema
Description
I have a config file for Log4j 2.x in XML format:
<?xml version="1.0" encoding="UTF-8"?>
<Configuration xmlns="https://logging.apache.org/xml/ns"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="https://logging.apache.org/xml/ns https://logging.apache.org/xml/ns/log4j-config-2.xsd">
<Appenders>
<Console name="console" target="SYSTEM_OUT">
<PatternLayout charset="UTF-8" pattern="test"/> <!-- error here -->
</Console>
</Appenders>
<Loggers>
<Root level="INFO">
<AppenderRef ref="console"/>
</Root>
</Loggers>
</Configuration>
The XML validation is telling me:
Attribute charset is not allowed here
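For anyone who wants to reproduce this kind of error outside an IDE, here is a minimal sketch using the JDK's built-in javax.xml.validation API. It uses a toy inline schema, which is an assumption for illustration and not the real log4j-config-2.xsd: the toy PatternLayout declares only pattern, mirroring the missing charset declaration. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class CharsetAttributeValidation {

    // Toy schema (assumption, NOT the published Log4j schema):
    // PatternLayout declares only the "pattern" attribute.
    public static final String TOY_XSD =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
          + "<xs:element name='PatternLayout'>"
          + "<xs:complexType>"
          + "<xs:attribute name='pattern' type='xs:string'/>"
          + "</xs:complexType>"
          + "</xs:element>"
          + "</xs:schema>";

    // Returns null if the document is valid, otherwise the validator's error message.
    public static String validate(String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(TOY_XSD)));
            try {
                schema.newValidator().validate(new StreamSource(new StringReader(xml)));
                return null;
            } catch (SAXException validationError) {
                // With no ErrorHandler set, validation errors are thrown as SAXException.
                return validationError.getMessage();
            }
        } catch (SAXException | IOException e) {
            // Reached only if the toy schema itself is broken or reading fails.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(validate("<PatternLayout pattern='test'/>"));
        System.out.println(validate("<PatternLayout charset='UTF-8' pattern='test'/>"));
    }
}
```

An attribute that the schema does not declare is rejected regardless of what the runtime plugin accepts, which is why the config can work at runtime and still fail schema validation.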
The documentation for Log4j 2.x says that PatternLayout has a charset attribute: https://logging.apache.org/log4j/2.x/manual/pattern-layout.html#plugin-attr-charset
Reading the website published at https://logging.apache.org/xml/ns, I get the impression that https://logging.apache.org/xml/ns/log4j-config-2.xsd is the official "configuration file xml schema".
Configuration
Version: Log4j 2.x
If I use https://logging.apache.org/xml/ns/log4j-config-2.23.1.xsd instead of https://logging.apache.org/xml/ns/log4j-config-2.xsd, charset is invalid, too.
After digging through the code, I found that the plugin descriptor log4j-core-plugins.xml still contains the charset attribute, so the problem seems to lie in how the XSD is generated from that descriptor. In particular, the culprit seems to be https://github.com/apache/logging-log4j2/blob/f2efdd5a33e869869528d727fd26c1d2a4f0754c/pom.xml#L368, which prevents XML types from being generated for classes in the java package; that in turn keeps attributes of those types out of the generated XSD in SchemaGenerator::writePluginAttribute. After removing the exclusion, the generated XSD contains the charset attribute.
A reasonable thing would probably be to allow generation of XML types for all classes for which a TypeConverter is available, even if those classes are in the java package.
@jbb01,
Could you submit a PR with the solution?
Unfortunately, I'm not quite sure what "the solution" is. The simplest approach would be to just remove the exclusion pattern. The resulting XSD will contain the missing <simpleType> for Charset as well as a few other types (Class, InetAddress, URL, Pattern) and also the missing attributes: HttpAppender.url, ColumnMapping.columnType, ColumnMapping.type, SocketAddress.host, RegexReplacement.regex, as well as the charset attribute for 12 appenders and layouts in total. This change, however, also results in the generation of <group>s for Serializable, Cloneable, Iterable, Object, Runnable and Comparator, which is probably what the exclusion tried to avoid in the first place. If you think the upsides outweigh the downsides, I'll gladly submit a PR with this solution.
Based on this first approach, you could also explicitly exclude only the <group> types or include the attribute types (though excludes take priority over includes, so including some types from java.* while excluding all others might result in an ugly regex) to avoid generation of the groups. This approach is not maintenance-free, as new attribute or group types would have to be manually added to the list.
The best approach in my opinion, though probably also the most difficult, would be to allow types based on the available TypeConverters. That would require knowing (or, better, detecting) the complete list of type converters available at build time and configuring the log4j-docgen-maven-plugin accordingly. I would not know how to go about implementing this.
@jbb01 What if we carve those particular classes out of the excludePattern, so that everything else in java.* stays excluded:
^java\.(?!lang\.Class|net\.InetAddress|net\.URL|util\.regex\.Pattern).+
(The snippet above is inspired by this SO post and is not tested.)
Does this help?
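Since the snippet is untested, here is a minimal sketch for checking it with plain java.util.regex. It assumes the pattern is matched against the fully qualified class name; how the docgen plugin actually applies it is not confirmed here. PROPOSED is the pattern exactly as written above. VARIANT is a hypothetical alternative, not something from the thread: it adds java.nio.charset.Charset to the exemptions and anchors each exempted name with `$` so that only exact class names pass.

```java
import java.util.List;
import java.util.regex.Pattern;

public class ExcludePatternCheck {

    // The pattern exactly as proposed in the thread.
    public static final Pattern PROPOSED = Pattern.compile(
            "^java\\.(?!lang\\.Class|net\\.InetAddress|net\\.URL|util\\.regex\\.Pattern).+");

    // Hypothetical variant (assumption): also exempts java.nio.charset.Charset
    // and requires the exempted name to end right there via `$`.
    public static final Pattern VARIANT = Pattern.compile(
            "^java\\.(?!(?:lang\\.Class|net\\.InetAddress|net\\.URL"
                    + "|util\\.regex\\.Pattern|nio\\.charset\\.Charset)$).+");

    // True if the exclude pattern matches, i.e. the class would be filtered out.
    public static boolean isExcluded(Pattern excludePattern, String className) {
        return excludePattern.matcher(className).find();
    }

    public static void main(String[] args) {
        List<String> names = List.of(
                "java.nio.charset.Charset",
                "java.lang.Class",
                "java.lang.ClassLoader",
                "java.net.URL",
                "java.io.Serializable",
                "org.apache.logging.log4j.Level");
        for (String name : names) {
            System.out.printf("%-32s proposed=%-5b variant=%b%n",
                    name, isExcluded(PROPOSED, name), isExcluded(VARIANT, name));
        }
    }
}
```

Under those assumptions, the pattern as proposed still excludes java.nio.charset.Charset, because that class is not in the lookahead list, and it lets prefix matches such as java.lang.ClassLoader through. The variant exempts Charset and keeps ClassLoader excluded.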
This approach is not maintenance-free, as new attribute or group types would have to be manually added to the list.
I can live with this until we have a better solution.
I've created a PR (#3568) with the proposed changes. The build and verification succeed, and the generated schema changed as expected.