java-agent

Class Loaders and Common Dependencies

Open keitwb opened this issue 7 years ago • 13 comments

Some projects use their own class loader (e.g. spring boot, servlet servers) that is a child of the standard Java system classpath class loader, which is in turn a child of the bootstrap and ext class loaders.

Things get pretty hairy when you bundle the java agent with the OT instrumentation libraries and those instrumentation libs contain dependencies that are provided by these child class loaders (e.g. the web-servlet-filter instrumentation depends on javax.servlet-api, which I have noticed is sometimes only available in a "custom" class loader provided by the framework).

If I'm understanding it correctly, classes loaded by a parent class loader cannot "see" classes that only exist in child class loaders. The java agent runs in either the bootstrap or system class loader (depending on the Boot-Class-Path manifest option) and thus can never see provided deps in non-standard class loaders. Moreover, it is not possible to simply create rules that use the custom class loader to load the instrumentation libraries, because class loaders defer up the hierarchy first, which means the libs will still be loaded from the agent bundle in the same manner.
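To make the visibility rule concrete, here is a minimal sketch. The class name ClassVisibilityDemo and its methods are made up for illustration. It asks the parent of its own defining loader for itself, and asks its own loader for a bootstrap class. On a normal classpath launch the first lookup should fail and the second should succeed.

```java
/** Hypothetical demo class; the names here are illustrative only. */
final class ClassVisibilityDemo {

    /** Returns true if the loader, or any loader it delegates to, can resolve the name. */
    static boolean canSee(ClassLoader loader, String className) {
        try {
            // initialize=false: only resolve the class, do not run static initializers.
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Asks the parent of this class's own loader whether it can resolve this class. */
    static boolean parentSeesChildClass() {
        ClassLoader child = ClassVisibilityDemo.class.getClassLoader();
        return canSee(child.getParent(), ClassVisibilityDemo.class.getName());
    }

    /** Asks this class's own loader for a class that the bootstrap loader defines. */
    static boolean childSeesParentClass() {
        return canSee(ClassVisibilityDemo.class.getClassLoader(), "java.lang.String");
    }
}
```

The same asymmetry is what bites the agent: code defined by the system or bootstrap loader is in the position of the parent here, and javax.servlet-api in a framework loader is in the position of the child-only class.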

I think this means that the agent either needs to include the provided dependencies itself (which limits it to a single version of those dependencies and increases the risk of a dependency mismatch with the instrumented application), or the instrumentation libraries need to be bundled separately from the agent and then loaded through the custom class loader by a custom rule or custom code.

Has anybody else dealt with this before or has any insights/thoughts?

keitwb avatar Oct 03 '18 14:10 keitwb

Yes, I have found the same problem, and it causes issues when trying to use the agent with JEE servers.

Unfortunately this is a side effect of using the rules to bootstrap the framework instrumentations, which is good from a reuse perspective.

However if the same instrumentation was fully encoded in the rules we would not have this issue with the class dependencies - but on the other hand we would be duplicating the framework instrumentations and therefore increasing the maintenance burden.

objectiser avatar Oct 03 '18 14:10 objectiser

You guys have hit on the exact area I'm focusing my attention. The fundamental problem is that the target 3rd party libs intended for instrumentation may be loaded by custom class loaders. For the instrumentation to link correctly, the instrumentation libs need to be loaded into the same class loader as where the target 3rd party lib is loaded. The logic necessary to accomplish this is currently missing from java-agent, but I believe it is doable with the help of Java's instrumentation APIs.
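As a rough sketch of the first step (finding out which loader the target lib lives in), a java.lang.instrument.ClassFileTransformer is handed the defining loader of every class as it loads. The class name TargetLoaderRecorder is hypothetical and this is not code from java-agent; it only records the loader and leaves the bytecode untouched.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical helper: remembers which class loader defines a given target class. */
final class TargetLoaderRecorder implements ClassFileTransformer {

    // Internal (slash-separated) name, e.g. "javax/servlet/Filter".
    private final String targetInternalName;
    private final AtomicReference<ClassLoader> targetLoader = new AtomicReference<>();

    TargetLoaderRecorder(String targetInternalName) {
        this.targetInternalName = targetInternalName;
    }

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        // loader is null for classes defined by the bootstrap loader.
        if (loader != null && targetInternalName.equals(className)) {
            targetLoader.compareAndSet(null, loader);
        }
        return null; // null means "no change to the class bytes"
    }

    /** The loader seen for the target class, or null if it has not loaded yet. */
    ClassLoader targetLoader() {
        return targetLoader.get();
    }
}
```

An agent would register this in premain with Instrumentation.addTransformer(...). Getting the instrumentation lib defined inside the recorded loader is the harder part and is not shown here.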

The fact that instrumentation libs provide the target 3rd party libraries is also a problem. The instrumentation libs do not need to provide the target 3rd party libraries, because these libraries must already be present in the application's runtime for the instrumentation to be necessary anyway. The proper way to fix this would be to have the POMs in the instrumentation libs use the "provided" scope for the 3rd party libs, but this would require cooperation from each developer of each instrumentation lib. As a workaround, I specify exclusions in the dependency spec when including instrumentation libs. For your reference, this POM has the spec for each OT instrumentation lib under opentracing-contrib.

safris avatar Oct 03 '18 15:10 safris

cc @adinn

Hi Andrew, do you have any recommendations on how to deal with this scenario?

objectiser avatar Oct 03 '18 16:10 objectiser

Hi Gary,

This problem has been noted :-). The biggest difficulty is that the mechanism needed for resolving classes is determined by the nature of the classloading model employed in the app that the agent is deployed into. So, one size will definitely not fit all.

This problem has been addressed by the Byteman agent in the context of JBoss Modules based apps. The agent supports a JBoss Modules-specific 'module plugin' that resolves IMPORT requests in the body of a rule (or at script global level). The IMPORT statements employ a plugin-specific syntax for the text following the IMPORT statement (you may insert more than one). In the abstract, IMPORTs specify extra linkage for resolution of types encountered during rule type checking. How that works is determined by the plugin. It takes the IMPORT set for a given rule and creates + hands back to the type checker a classloader which delegates to a suite of other classloaders to resolve types. That classloader suite necessarily includes the loader of the injection target class.
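The general shape of such a delegating loader can be sketched as follows. This is not the Byteman module plugin code; CompositeClassLoader and resolveOrNull are hypothetical names, and the sketch assumes the imports have already been turned into class loaders.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical resolver: tries the injection target's loader first, then each imported loader. */
final class CompositeClassLoader extends ClassLoader {

    private final List<ClassLoader> delegates = new ArrayList<>();

    CompositeClassLoader(ClassLoader targetLoader, List<ClassLoader> importedLoaders) {
        super((ClassLoader) null); // parent is the bootstrap loader only
        if (targetLoader != null) {
            delegates.add(targetLoader);
        }
        for (ClassLoader imported : importedLoaders) {
            if (imported != null) {
                delegates.add(imported);
            }
        }
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        // Called only after the bootstrap loader has failed to resolve the name.
        for (ClassLoader delegate : delegates) {
            try {
                return delegate.loadClass(name);
            } catch (ClassNotFoundException ignored) {
                // fall through to the next delegate
            }
        }
        throw new ClassNotFoundException(name);
    }

    /** Convenience wrapper that returns null instead of throwing. */
    Class<?> resolveOrNull(String name) {
        try {
            return loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }
}
```

Classes found this way are still defined by the delegate that owns them; the composite only widens what the type checker can resolve.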

So, if you want some way of importing types into the target context for a rule in an app that uses a different class loader model, you might take a look at the JBoss Modules plugin and see if you can rework it to provide something similar for your deployments. I'd be happy to provide advice on this if you have any questions.

regards,

Andrew Dinn

adinn avatar Oct 03 '18 16:10 adinn

@adinn Thanks for the information - I'll have to try it out.

Not sure if it would be an issue, but if we have a common framework we want to instrument (e.g. okhttp), I'm not sure how we could provide a rule that works across multiple environments with custom classloaders.

Can a single rule file have multiple IMPORT statements, where only one may actually apply in the runtime being instrumented? i.e. are IMPORT statements ignored if the plugins can't resolve them, assuming that another plugin may be able to?

objectiser avatar Oct 03 '18 16:10 objectiser

Hi Gary,

At the moment there is only room for one module plugin to handle imports. However, I have been considering -- if the demand ever arose -- the option of providing multiple plugins (in fact I have been hoping this need might arise). The idea is to let the plugins fight over who gets to handle an import using whatever keywords/syntactic variants they choose to distinguish in the text following the IMPORT keyword to make it unambiguous that handling by a specific plugin is required.

It would not require much to enable multiple plugins to be specified at agent init and to modify the plugin API to allow them to acknowledge or disown a specific import request. Whatever disambiguation process might be needed in order to provide such an ack/nack would be purely at the discretion of the plugins. So, I'm happy to play ball and upgrade Byteman if you want to investigate this option :-)

regards,

Andrew Dinn

adinn avatar Oct 04 '18 07:10 adinn

As regards ignoring IMPORTs, at the moment there is no real policy on that. A plugin gets passed the target class's classloader and all the IMPORTs applicable to a rule (i.e. the union of all IMPORTs active at script global scope and all IMPORTs embedded in the rule) and returns a classloader which satisfies them. If it chooses to, it could ignore some IMPORTs. The JBoss Modules plugin throws errors but that is its choice.

With multiple plugins and associated IMPORT syntaxes there are a variety of options. Each plugin could get a separate bite at the full IMPORTs list and the first to return a classloader wins. Alternatively, each plugin could remove any IMPORTs it likes and pass on the remainder set plus an updated classloader to the next plugin for further customization and so on. In that case any IMPORTs left over could be ignored or could throw an exception. It all depends what is wanted. A variety of policies could be made available if needed. If you can come up with a useful model I'll be happy to see what can be implemented.
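The second policy (each plugin claims the IMPORTs it recognises and passes the rest on) might look roughly like this. Everything here is hypothetical: ImportPlugin, PrefixPlugin and ImportChain are not part of Byteman's plugin API, and the "jboss:" / "jar:" prefixes used to tell plugins apart are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical plugin contract; NOT Byteman's actual module plugin API. */
interface ImportPlugin {
    /** Ack (true) or nack (false) a single IMPORT text. */
    boolean accepts(String importText);

    /** Returns a loader that wraps the current one and satisfies the accepted imports. */
    ClassLoader extend(ClassLoader current, List<String> accepted);
}

/** Toy plugin that claims imports by prefix and wraps the loader without adding classes. */
final class PrefixPlugin implements ImportPlugin {
    private final String prefix;
    final List<String> handled = new ArrayList<>();

    PrefixPlugin(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public boolean accepts(String importText) {
        return importText.startsWith(prefix);
    }

    @Override
    public ClassLoader extend(ClassLoader current, List<String> accepted) {
        handled.addAll(accepted);
        return new ClassLoader(current) { }; // placeholder for a real module-aware loader
    }
}

final class ImportChain {
    /** Lets each plugin in turn remove the imports it handles and customise the loader. */
    static ClassLoader resolve(ClassLoader targetLoader, List<String> imports,
                               List<ImportPlugin> plugins, boolean failOnLeftovers) {
        List<String> remaining = new ArrayList<>(imports);
        ClassLoader loader = targetLoader;
        for (ImportPlugin plugin : plugins) {
            List<String> accepted = new ArrayList<>();
            for (String importText : remaining) {
                if (plugin.accepts(importText)) {
                    accepted.add(importText);
                }
            }
            if (!accepted.isEmpty()) {
                loader = plugin.extend(loader, accepted);
                remaining.removeAll(accepted);
            }
        }
        if (failOnLeftovers && !remaining.isEmpty()) {
            throw new IllegalArgumentException("Unhandled IMPORTs: " + remaining);
        }
        return loader;
    }
}
```

The failOnLeftovers flag corresponds to the "ignored or could throw an exception" choice; a rule file shared across platforms would presumably want it off so that IMPORTs meant for other runtimes are skipped.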

adinn avatar Oct 04 '18 08:10 adinn

@adinn Ok thanks, sounds like a promising idea to solve the problem. I just need to find some time to do more investigation :)

@keitwb @SevaSafris Do you have specific runtimes you have been looking at? Would be good if we could come up with some reproducers to test out this idea.

objectiser avatar Oct 04 '18 09:10 objectiser

I have been looking at getting the Spring Cloud instrumentation fully bundled with the Java agent. The classloader issues I had there were from:

  1. the resources that Spring Boot loads on app initialization, such as the TraceEnvironmentPostProcessor referenced in a spring.factories resource for the core instrumentation starter (I copied and merged all of them into a bundle using the maven shade plugin). Because that class exists in the system classpath, it cannot access the interface that it implements (EnvironmentPostProcessor), since that can only be loaded within the Spring Boot custom class loader. I think that particular issue is outside of the purview of Byteman since there are no rules involved, only META-INF resources.

  2. Even when I didn't include the META-INF resources from OT's java-spring-cloud and just used the existing web-servlet-filter rules in the existing Java agent, I ran into class loader issues when the rules ran, since javax.servlet.* classes only exist in the Spring Boot nested JARs. If I'm understanding the IMPORT mechanism correctly, I think that could handle this problem if we had a proper Spring Boot ModuleSystem implementation for Byteman.

What is confusing to me about this is that the Spring Boot class loader has nothing to do with Java Modules proper, but I assume the Byteman ModuleSystem system can work with anything as long as it uses a ClassLoader. I'm going to take a stab at an implementation if I get some time in the coming week.

I was also looking at JEE servlet servers with respect to getting the OT JAX-RS instrumentation working automatically. Glassfish specifically, but I'm trying to make this work with JBoss as well (which I suppose will be easier since Byteman already has JBoss module support).

keitwb avatar Oct 04 '18 21:10 keitwb

" If I'm understanding the IMPORT mechanism correctly, I think that could handle this problem if we had a proper Spring Boot ModuleSystem implementation for Byteman"

I think that's probably true. Essentially you get to build a custom classloader to resolve classes at the point of injection. So, the IMPORT statement is a way of specifying which loaders to delegate to (i.e. it operates as a 'module' import for some notion of module). Of course, there are caveats.

  1. You can link rule code to the target classes you have imported and access their data/invoke their methods. However, those methods still remain type-checked firmly within their own classloader scope. For example, with the JBoss Modules plugin you can import the transactions impl module into a rule injected into an EJB method and refer to TransactionImple, say, to get the current TX stats -- but if you want to pass the EJB in a TransactionImple method call it has to be as an Object because the TX does not know about EJBs. Ditto for any helper you provide -- it will only be able to reference TransactionImple in its methods if it is deployed in a loader which can see TransactionImple. (Yeah, classloaders and modules are hard.)

  2. If you import things willy-nilly you can always land yourself in classloader hell by linking to the wrong version of the same class that is available in different loader paths (essentially, try not to create diamonds in the loader delegation hierarchy :-).

adinn avatar Oct 05 '18 07:10 adinn

Just as a heads-up I am off on PTO today for 2 weeks. So, please do go ahead and play with this stuff and I will be sure to provide help but only after a short hiatus. If you do want more detailed help I suggest posting to the Byteman forum (http://community.jboss.org/en/byteman?view=discussions) but I'll also pick up stuff from here if need be.

adinn avatar Oct 05 '18 07:10 adinn

Ok thanks for the info - we definitely need to try some things out to see whether the caveats would be an issue.

In terms of a possible solution - I think we have two options:

  1. Try to use the module system of the target platform (e.g. jboss modules) and then use the relevant IMPORT statements in the rules - so one (set of) IMPORT statement per target platform/module system

  2. Use a generic approach that could be used with (almost) any target platform and only require a single IMPORT statement

So for (1), using jboss-modules as an example we would need to package up the framework instrumentation libraries as jboss modules and then reference them in the rules by their module name.

However, for (2), we could package the instrumentation jars alongside the java-agent jar and reference them via an IMPORT that references the jar name (for example). So it would be target platform independent.

The issues with (2) are that:

  • it may not work with Spring Boot, where some additional registration of components in the Spring context may need to occur (beans, post processors, etc). However, maybe additional rules could be used in these cases.
  • using the module system approach would resolve transitive dependencies, which listing/loading single jars would not, so it may require more jars to be listed in IMPORT statements
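For option (2), one possible shape is a plain URLClassLoader over jars that sit next to the agent jar, parented on the injection target's loader so the instrumentation links against the application's own copy of the 3rd party lib. This is only a sketch under those assumptions; JarImportLoader, the agentDir parameter and the jar-name convention are all hypothetical.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

/** Hypothetical helper for jar-name based imports. */
final class JarImportLoader {

    /**
     * Builds a loader over the named jars in agentDir. Lookups go to targetLoader first
     * (standard parent-first delegation), then to the listed jars.
     */
    static URLClassLoader forJars(File agentDir, List<String> jarNames, ClassLoader targetLoader) {
        URL[] urls = new URL[jarNames.size()];
        for (int i = 0; i < urls.length; i++) {
            try {
                urls[i] = new File(agentDir, jarNames.get(i)).toURI().toURL();
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException("Bad jar name: " + jarNames.get(i), e);
            }
        }
        return new URLClassLoader(urls, targetLoader);
    }
}
```

Nothing here resolves transitive dependencies, so every jar the instrumentation needs and the application does not already provide has to be listed explicitly, which is the second issue noted above.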

objectiser avatar Oct 05 '18 08:10 objectiser

Linking this discussion for reference: https://developer.jboss.org/message/984701#984701

keitwb avatar Oct 09 '18 21:10 keitwb