cassandra-reaper icon indicating copy to clipboard operation
cassandra-reaper copied to clipboard

Get upgrade integration tests to run using `DropwizardTestSupport` inside classloaders and reflection

Open michaelsembwever opened this issue 6 years ago • 31 comments

Project board link

It should be possible to run:

mvn -Pintegration-upgrade-tests -Dtest=ReaperCassandraIT

but trying to start the jetty/jersey server via reflection ( https://github.com/thelastpickle/cassandra-reaper/blob/master/src/server/src/test/java/io/cassandrareaper/acceptance/ReaperTestJettyRunner.java#L171 ) throws a class cast exception (around jboss' AutoDiscoverable).

When this is fixed then the upgrade integration tests can be uncommented and run again in both travis and circleCI.

The requirement to the upgrade integration tests is that the previous versions of ReaperApplication (inside a DropwizardTestSupport) is run via separated classloaders. The classloaders need to be child-first, as opposed to parent-first, for this to work. The upgrade integration tests run the cucumber scenarios against the past version of a ReaperApplication test server, and then mid-scenario replaces the ReaperApplication test server with once from the test classpath. The upgrade integration tests ensure all normal Reaper operations complete successfully while Reaper is upgraded.

┆Issue is synchronized with this Jira Story by Unito ┆Issue Number: REAP-158

michaelsembwever avatar Mar 19 '19 10:03 michaelsembwever

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


This issue now has a funding of 0.998 ETH (137.94 USD @ $138.21/ETH) attached to it.

gitcoinbot avatar Mar 19 '19 22:03 gitcoinbot

@michaelsembwever I've managed to build successfully (the only thing bower been needed to be installed globally through npm i -g bower. That should be stated somewhere in docs).

Seems your issue explained here: https://stackoverflow.com/questions/37145127/java-lang-classcastexception-org-glassfish-jersey-jackson-internal-jacksonautod

so, going to play with it once @gitcoinbot allow me to bid on

aahutsal avatar Mar 20 '19 14:03 aahutsal

@michaelsembwever I've managed to build successfully (the only thing bower been needed to be installed globally through npm i -g bower. That should be stated somewhere in docs).

Indeed. We've all been using bower for so long we're forgotten this is an external install. This shortcoming in the docs comes from when we made building the UI a mandatory part of the build in https://github.com/thelastpickle/cassandra-reaper/pull/415

Seems your issue explained here: https://stackoverflow.com/questions/37145127/java-lang-classcastexception-org-glassfish-jersey-jackson-internal-jacksonautod

Nice find. Note this might not be the only issue. Did you manage to get upgrade integration tests running against 1.2.2, 1.3.0, and 1.4.1 ( https://github.com/thelastpickle/cassandra-reaper/blob/master/src/server/pom.xml#L41-L43 ). The output of mvn -Pintegration-upgrade-tests -Dtest=ReaperCassandraIT should show each scenario run against each of these versions. And did you have any trouble setting up ccm so that the jmx calls from the embedded Reaper server were connecting to it? (this is a typical stumbling block for many trying to run the integration tests locally)

so, going to play with it once @gitcoinbot allow me to bid on

this is the first time for me using gitcoin. to my best understanding you have to Request to Work and that will send me an email notification to which I have to approve ??

michaelsembwever avatar Mar 20 '19 20:03 michaelsembwever

@michaelsembwever this is the first time I"m using gitcoin too. Yeah, I've to Request to Work it seems.

aahutsal avatar Mar 21 '19 05:03 aahutsal

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


Work has been started.

These users each claimed they can complete the work by 1 week from now. Please review their action plans below:

1) gutsal-arsen has been approved to start work.

@michaelsembwever I've managed to build successfully (the only thing bower been needed to be installed globally through npm i -g bower. That should be stated somewhere in docs).

Seems your issue explained here: https://stackoverflow.com/questions/37145127/java-lang-classcastexception-org-glassfish-jersey-jackson-internal-jacksonautod

so, going to play with it once @gitcoinbot allow me to bid on

Learn more on the Gitcoin Issue Details page.

gitcoinbot avatar Mar 21 '19 05:03 gitcoinbot

@michaelsembwever I've managed to build and run tests, but it seems fails: image

What I've done is:

  1. pull cassandra:latest via docker pull cassandra
  2. run and bind cassandra to localhost:9042 via: docker run --name some-cassandra -ti --rm -p9042:9042 cassandra:latest
  3. run server via cd src/server; java -jar target/cassandra-reaper-1.5.0-SNAPSHOT.jar server target/test-classes/cassandra-reaper.yaml
  4. run ReaperCassandraIT test via mvn test -Pintegration-upgrade-tests -Dtest=ReaperCassandraIT (by the way you missed "test" goal in your mvn -Pintegration-upgrade-tests -Dtest=ReaperCassandraIT in your message above)

Got the error above. I assume it's some misconfiguration. Since there's no much documentation on how to run the stuff, I need your help.

aahutsal avatar Mar 21 '19 06:03 aahutsal

@gutsal-arsen correct that's a misconfiguration/incorrect setup of the Cassandra service.

Offline (as I'll be asleep the next 8hrs), the easiest way to help you would be to, look into how both travis and circleCi setups up cassandra using CCM. you can also use a local Cassandra server. but it's important that jmx AND cql connections are established. and taking the ccm route will give you better parity with the CI builds.

You don't need to run the reaper service (what you did with java -jar target/cassandra-reaper-1.5.0-SNAPSHOT.jar server target/test-classes/cassandra-reaper.yaml), as that happens inside the integration tests (see ReaperTestJettyRunner)

michaelsembwever avatar Mar 21 '19 10:03 michaelsembwever

@michaelsembwever I've build Docker container for tests and trying to reproduce your issue. In the meantime could you explain the reason JMX connection fails on numerous tests attempting to connect to port 7300,7400 at 127.0.0.{2,3,4}? JMX is not running on these IPs, is that a bug or some misconfiguration from at my side again?

Causing: java.lang.RuntimeException: io.cassandrareaper.ReaperException: Failure when establishing JMX connection to 127.0.0.3:7300
at io.cassandrareaper.jmx.JmxConnectionFactory$JmxConnectionProvider.apply(JmxConnectionFactory.java:242)                                      
at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660)                                                         
at io.cassandrareaper.jmx.JmxConnectionFactory.connectImpl(JmxConnectionFactory.java:103) 

aahutsal avatar Mar 22 '19 09:03 aahutsal

For the non-127.0.0.1 nodes you need to set environment variable as LOCAL_JMX=no This is otherwise defined in Cassandra's cassandra-env.sh. Also see https://github.com/thelastpickle/cassandra-reaper/blob/master/src/server/src/test/resources/README.md

michaelsembwever avatar Mar 22 '19 23:03 michaelsembwever

@gutsal-arsen Hello from Gitcoin Core - are you still working on this issue? Please submit a WIP PR or comment back within the next 3 days or you will be removed from this ticket and it will be returned to an ‘Open’ status. Please let us know if you have questions!

  • [x] reminder (3 days)
  • [ ] escalation to mods (6 days)

Funders only: Snooze warnings for 1 day | 3 days | 5 days | 10 days | 100 days

gitcoinbot avatar Mar 27 '19 16:03 gitcoinbot

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


@gutsal-arsen due to inactivity, we have escalated this issue to Gitcoin's moderation team. Let us know if you believe this has been done in error!

  • [x] reminder (3 days)
  • [x] escalation to mods (6 days)

Funders only: Snooze warnings for 1 day | 3 days | 5 days | 10 days | 100 days

gitcoinbot avatar Mar 30 '19 16:03 gitcoinbot

@gitcoinbot I do have a question. It still shows me 14 days I need to complete that issue. Am I right?@michaelsembwever? If needed urgent solution, why time to complete been set to 30 days? Both, please, clarify.

aahutsal avatar Apr 04 '19 10:04 aahutsal

@michaelsembwever I'm assured I can complete. Just stuck with initial setup and snoozed to later time.

aahutsal avatar Apr 04 '19 10:04 aahutsal

fixed (snoozed) @gutsal-arsen . if you're making progress i'm happy. i don't know how to change the "Time Left" field on the gitcoin page, will ask support.

michaelsembwever avatar Apr 04 '19 22:04 michaelsembwever

@michaelsembwever looks like it's hard to mimic circleci logic fully, so I'll probably run circleci --local every time testing source changes. Since env preparation takes too much time, may I use Docker layer caching?

aahutsal avatar Apr 05 '19 06:04 aahutsal

@michaelsembwever could you please help with that?

image

This happens on circleci local execute --config .circleci/config.yml

aahutsal avatar Apr 05 '19 12:04 aahutsal

Since env preparation takes too much time, may I use Docker layer caching?

I don't know enough about Docker layer caching. Is it a premium feature? Otherwise, anything to speed up the CI builds would be much appreciated. Can we put this in as a separate issue/pr?

michaelsembwever avatar Apr 05 '19 23:04 michaelsembwever

circleci local execute --config .circleci/config.yml worked for me.

The rat plugin ensures all files have appropriate license headers. If you have added new files make sure they have the license header. Otherwise, check out the contents of /home/circleci/project/target/rat.txt to see what the error is.

michaelsembwever avatar Apr 05 '19 23:04 michaelsembwever

Docker layer caching means that all dependencies will be downloaded only once (that in theory could greatly speed up build process for next builds). Asking coz running circleci local execute --config .circleci/config.yml without it still takes lot of time due to dependency download.

Haven't added anything. Just merged to current master (that has been updated since last time I forked). I'll try circleci local execute --config .circleci/config.yml on original master and see the difference. Will let you know.

I'm off till Monday.

aahutsal avatar Apr 06 '19 04:04 aahutsal

@michaelsembwever I've seems discovered the problem. This actually is not reflection problem, this is configuration problem:

image

Any thoughts?

aahutsal avatar Apr 08 '19 13:04 aahutsal

This actually is not reflection problem, this is configuration problem.

This is a real problem. Reaper supports backward compatibility of the backend storage, but does not support forward (or any) compatibility of the configuration file. So there's no need for the test to be validating this. I will fix this for you.

michaelsembwever avatar Apr 08 '19 20:04 michaelsembwever

@michaelsembwever Plz, lemme know what commit to merge into my fork.

aahutsal avatar Apr 09 '19 01:04 aahutsal

@gutsal-arsen The trick here is for the tests to provide the forward-compatibility on the yaml configuration to tell the jackson object mapper not to fail when deserialising unknown properties… This is done with a call to bootstrap.getObjectMapper().disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES). But this has to be done via reflection in ReaperTestJettyRunner.

See https://github.com/thelastpickle/cassandra-reaper/pull/642

michaelsembwever avatar Apr 09 '19 05:04 michaelsembwever

ClassCastException been caused by different versions of spidiscovery.internal.MetaInfServicesAutoDiscoverable. One version is packaged into cassandra-reaper-x.x.x.jar, another version comes with jersey-metainf-services.jar

Same for validation.internal.ValidationAutoDiscoverable, wadl.internal.WadlAutoDiscoverable, jaxb.internal.JaxbAutoDiscoverable, logging.internal.LoggingFeatureAutoDiscoverable, filter.internal.ServerFiltersAutoDiscoverable

So far, so good. Excluded jersey-metainf-services, jersey-bean-validation and stuck. WadlAutoDiscoverable does not come as separate jar, it is the part of jersey-server, thus can not be excluded. I think at this point you should answer a) what's the reason to include *AutoDiscoverable classes into cassandra-reaper? b) can you re-package cassandra-reaper to exclude them?

Don't think ClassCastException could be solved other way #641

image

aahutsal avatar Apr 10 '19 13:04 aahutsal

I think at this point you should answer a) what's the reason to include *AutoDiscoverable classes into cassandra-reaper? b) can you re-package cassandra-reaper to exclude them?

We can't change the already previously released versions of cassandra-reaper. This would only be something we could do moving forward. If Reaper still runs ok without it then it should be excluded anyway.

One version is packaged into cassandra-reaper-x.x.x.jar, another version comes with jersey-metainf-services.jar

In the beginning of ReaperJettyTestSupport's constructor can we classload the correct WadlAutoDiscoverable and AutoDiscoverable using the appClass's classloader? Or are some of these classes already loaded in the current classloader by the static/import references? Is the a way to exclude them from the test classpath instead?

michaelsembwever avatar Apr 10 '19 21:04 michaelsembwever

@michaelsembwever ClassCastException happens somewhere in Dropwizard. We can guess what to load now. But tomorrow there will be another situation and our guess will fail. This is not the solution, man.

aahutsal avatar Apr 11 '19 04:04 aahutsal

But tomorrow there will be another situation and our guess will fail.

"Tomorrow" depends on us upgrading the version of dropwizard-testing. I'm fine with that limitation. And this is the approach I think I favour. It makes sense too, everything related to the server has to be loaded through the versioned classloader. That implies that those classes can be referenced and loaded in the normal test classloader.

Another approach could be to not load certain packages through the ChildURLClassLoader. For example, always loading AutoDiscoverable from the system classloader. But I don't know if that approach will work or just become a rabbit hole.

michaelsembwever avatar Apr 11 '19 05:04 michaelsembwever

I've finished my research. Considered 2 ways to accomplish that.

  1. Exclude JARs that contains duplicated classes from surefire classpath. Worked well initially, but stuck at this point. If I keep excluding JARs we get ClassNotFound exception. So, that's not the solution.
  2. Another solution would be to use custom ClassLoader as you started to do, but this is not going to work as well, as Jersey/Dropwizard escapes from the sandbox in some places calling ClassLoader.getSystemClassloader() like it happens here. Once this happens incorrect versions of classes are loaded and we get ClassCastException again.

aahutsal avatar Apr 11 '19 11:04 aahutsal

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


Work for 0.998 ETH (163.34 USD @ $163.67/ETH) has been submitted by:

  1. @gutsal-arsen

@michaelsembwever please take a look at the submitted work:

  • PR by @gutsal-arsen

gitcoinbot avatar Apr 11 '19 13:04 gitcoinbot

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


The funding of 0.998 ETH (163.5 USD @ $163.82/ETH) attached to this issue has been approved & issued to @gutsal-arsen.

gitcoinbot avatar Apr 11 '19 21:04 gitcoinbot