Create dependency injection metadata index
The Log4j annotation processor should be enhanced to generate and store metadata about the dependencies of plugins. This should help with implementing https://github.com/apache/logging-log4j2/issues/3055, as dependency information needs to be scanned for reflection reachability purposes anyway.
This metadata should be sufficient for performing dependency injection on those classes without runtime discovery. While the initial implementation will rely on reflection to create and configure plugin instances, this will not require scanning over code reflectively to find members based on programmatic considerations and can rely on the previously created index. This index should replicate the dependency injection rules defined by InstanceFactory. Such an index would make it possible to add better checking of Log4j plugins for correctness at build time and when parsing a configuration.
For example, if plugin P has a required attribute A and a dependency of type D, then the metadata for P should note its injection points (IPs), and it should also have metadata (or generated code) to specify how to bind to those injection points. For reflection purposes, this would indicate which IPs are fields, which IPs are part of an injectable constructor, and which IPs are part of methods. Whatever approach is used here should remain flexible so that a code generation approach can be used as an alternative to the reflective approach in a later feature.
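To make the P/A/D example concrete, here is a minimal sketch of what one index entry could look like. Every name in it (`InjectionIndexSketch`, `Kind`, `Source`, `InjectionPoint`, `PluginMetadata`, and the `com.example` classes) is hypothetical and chosen for illustration only; it does not reflect the rules in InstanceFactory or any existing Log4j API.

```java
import java.util.List;

// Hypothetical model of a DI metadata index entry; not part of Log4j.
final class InjectionIndexSketch {

    /** Where an injection point lives on the plugin class. */
    enum Kind { FIELD, CONSTRUCTOR_PARAMETER, METHOD_PARAMETER }

    /** What supplies the value: a configuration attribute or another injectable type. */
    enum Source { ATTRIBUTE, DEPENDENCY }

    /** One injection point: the owning member, the declared type, and how it is bound. */
    record InjectionPoint(Kind kind, Source source, String member, String type, boolean required) {}

    /** Index entry for a single plugin class. */
    record PluginMetadata(String className, List<InjectionPoint> injectionPoints) {
        /** Returns only the injection points of the given kind. */
        List<InjectionPoint> ofKind(Kind kind) {
            return injectionPoints.stream().filter(ip -> ip.kind() == kind).toList();
        }
    }

    /** Plugin P: required attribute A injected into a field, dependency D injected via the constructor. */
    static PluginMetadata exampleP() {
        return new PluginMetadata("com.example.P", List.of(
                new InjectionPoint(Kind.FIELD, Source.ATTRIBUTE, "a", "java.lang.String", true),
                new InjectionPoint(Kind.CONSTRUCTOR_PARAMETER, Source.DEPENDENCY, "<init>", "com.example.D", true)));
    }

    public static void main(String[] args) {
        PluginMetadata p = exampleP();
        System.out.println("fields: " + p.ofKind(Kind.FIELD));
        System.out.println("constructor parameters: " + p.ofKind(Kind.CONSTRUCTOR_PARAMETER));
    }
}
```

Keeping the kind of member separate from the binding source is one way to let a later code generator consume the same entries that a reflective implementation would.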
@jvz, thanks so much for helping with #3055. 💯
#3053 already delivered the addition of reachability-metadata.json to all modules in 2.x, and #3055 was created to port this work to main. This new ticket (i.e., #3875) suggests introducing an enhancement to main to "help with" #3055. If my understanding is correct so far:
- Can we port #3053 to `main` without this enhancement?
- If yes, wouldn't it be preferable to use the same mechanism employed in `2.x` also in `main` to ease future (backward and forward) porting needs?
I've been investigating this based on what was done to support it in 2.x. The main difference in main is that there are more potential reflection sites to index than in 2.x. For example, in main, we also want to know about an @Inject-annotated constructor, any of its @Inject-annotated methods and fields, and the set of plugin-related annotations. As I started considering how to model this based on the 2.x processor, I realized that the data being indexed can be described as a sort of dependency injection metadata index. In any case, yes, this is based on the technique used in 2.x.
What I plan to do on the implementation side is that this DI metadata will be used to generate the GraalVM reachability metadata, and in the future, this may also be useful for generating code for building plugins so we can avoid reflection entirely. And while I haven't verified this yet, it might be simpler to write code to transform one tree of classes into another tree of classes to dump to JSON than it is to write two separate annotation processors.
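The "transform one tree of classes into another tree to dump to JSON" idea could look roughly like the sketch below. It is only an illustration under stated assumptions: the record types and the class `ReachabilitySketch` are hypothetical, and the output only approximates the `reflection` section of GraalVM's reachability-metadata.json (a `type` with `fields` and `methods`, constructors named `<init>`). The JSON is assembled by hand and no escaping is done, on the assumption that class and member names contain no characters that need it.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical transformation from DI metadata to reflection entries; not part of Log4j.
final class ReachabilitySketch {

    /** An injected field, identified by name. */
    record FieldPoint(String name) {}

    /** An injected constructor ("<init>") or method with its parameter types. */
    record ExecutablePoint(String name, List<String> parameterTypes) {}

    /** DI metadata for one plugin class. */
    record PluginEntry(String className, List<FieldPoint> fields, List<ExecutablePoint> executables) {}

    /** Wraps a value in double quotes; assumes the value needs no JSON escaping. */
    static String quote(String s) {
        return "\"" + s + "\"";
    }

    /** Renders one plugin as a single reflection entry. */
    static String toReflectionEntry(PluginEntry p) {
        String fields = p.fields().stream()
                .map(f -> "{\"name\":" + quote(f.name()) + "}")
                .collect(Collectors.joining(","));
        String methods = p.executables().stream()
                .map(e -> "{\"name\":" + quote(e.name()) + ",\"parameterTypes\":["
                        + e.parameterTypes().stream()
                                .map(ReachabilitySketch::quote)
                                .collect(Collectors.joining(","))
                        + "]}")
                .collect(Collectors.joining(","));
        return "{\"type\":" + quote(p.className())
                + ",\"fields\":[" + fields + "]"
                + ",\"methods\":[" + methods + "]}";
    }

    /** Renders all plugins under a top-level "reflection" array. */
    static String toJson(List<PluginEntry> plugins) {
        return plugins.stream()
                .map(ReachabilitySketch::toReflectionEntry)
                .collect(Collectors.joining(",", "{\"reflection\":[", "]}"));
    }

    /** Example input: plugin P with injected field "a" and a constructor taking D. */
    static List<PluginEntry> example() {
        return List.of(new PluginEntry(
                "com.example.P",
                List.of(new FieldPoint("a")),
                List.of(new ExecutablePoint("<init>", List.of("com.example.D")))));
    }

    public static void main(String[] args) {
        System.out.println(toJson(example()));
    }
}
```

A real processor would more likely use a JSON library and the `javax.lang.model` element types rather than plain strings; the point here is only that the reachability output can be derived from the DI model in one pass.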
How 2.x plugin scanning works is an interesting consideration. Let me see what's possible here to design this such that it can consist of largely the same code in both branches. If we're ok with changing some internal details on how plugins work in 2.x, then we could even port some updates from main to 2.x.
> What I plan to do on the implementation side is that this DI metadata will be used to generate the GraalVM reachability metadata, and in the future, this may also be useful for generating code for building plugins so we can avoid reflection entirely. And while I haven't verified this yet, it might be simpler to write code to transform one tree of classes into another tree of classes to dump to JSON than it is to write two separate annotation processors.
What I’d suggest is tackling these problems incrementally:
- Generate GraalVM reachability metadata directly in the `GraalVmProcessor`. There’s no strong need to introduce another layer of indirection for this. Since generating the proposed injection metadata will already require an annotation processor to process `@Inject` annotations, the additional intermediate step seems avoidable.
- Defer the broader enhancement until we can evaluate it in the context of existing tooling:
  - Established, general-purpose dependency injection frameworks.
  - Call graph generators. Dependency injection is a common challenge when determining method reachability, not just for GraalVM. It may be best to wait until @openrefactorymunawar’s Java call graph generator is released, then align on a metadata format that could serve both GraalVM and the call graph generator.
Also, note that Log4j already has a custom metadata format for documenting plugins (Log4j Docgen). If needed, we could extend that format rather than inventing an entirely new one.
As I looked more into how to implement this, I believe the GraalVM metadata should be indexed first. I agree on deferring the broader enhancement as that is something that might end up being easier to do later. And the metadata format is very much the sort of thing I was hoping we might have; that looks like an excellent basis for the future work.