jepsen
Problem with SSH loggers?
I have spun up the jepsen-vagrant environment and gotten past the issues noted in #39, but now I get this error: "Auth fail". I'm able to 'ssh root@n1', and it would be nice to see more verbose messages about the failure. The output also seems to complain that the logger for clj-ssh hasn't been set up properly:
-- snip --
vagrant@jepsen:/jepsen$ !lein
lein with-profile +rabbitmq test jepsen.system.rabbitmq-test
lein test jepsen.system.rabbitmq-test
SLF4J: The following loggers will not work becasue they were created
SLF4J: during the default configuration phase of the underlying logging system.
SLF4J: See also http://www.slf4j.org/codes.html#substituteLogger
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
lein test :only jepsen.system.rabbitmq-test/rabbit-test
ERROR in (rabbit-test) (Session.java:512)
Uncaught exception, not in assertion.
expected: nil
  actual: com.jcraft.jsch.JSchException: Auth fail
 at com.jcraft.jsch.Session.connect (Session.java:512)
    com.jcraft.jsch.Session.connect (Session.java:183)
    clj_ssh.ssh$connect.invoke (ssh.clj:327)
    jepsen.control$session.invoke (control.clj:182)
    clojure.lang.AFn.applyToHelper (AFn.java:154)
    clojure.lang.AFn.applyTo (AFn.java:144)
    clojure.core$apply.invoke (core.clj:624)
    jepsen.core$fcatch$wrapper__4829.doInvoke (core.clj:39)
    clojure.lang.RestFn.invoke (RestFn.java:408)
    clojure.core$pmap$fn__6328$fn__6329.invoke (core.clj:6463)
    clojure.core$binding_conveyor_fn$fn__4145.invoke (core.clj:1910)
    clojure.lang.AFn.call (AFn.java:18)
    java.util.concurrent.FutureTask.run (FutureTask.java:266)
    java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1142)
    java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:617)
    java.lang.Thread.run (Thread.java:745)
Ran 1 tests containing 1 assertions.
0 failures, 1 errors.
Tests failed.
Error encountered performing task 'test' with profile(s): 'base,system,user,provided,dev,rabbitmq'
Tests failed.
-- snip --
I'm happy to help continue troubleshooting this, and have even sent a PR to jepsen-vagrant to note some of the steps I took to get past hurdles, but I'm drawing a blank here so far.
Thanks in advance to anyone who has time to help with this!
Hi Justin,
I think the logger error is caused by classloading issues: slf4j initializes itself by looking at what is visible to the classloaders to find the implementation to use, and I believe some dependencies are not packaged in a way that lets dependent code choose its own logger. I remember reading some notes about this in project.clj.
jsch uses its own logging system, so it does not depend on slf4j, BUT clj-ssh wraps the jsch logger in a Clojure logger...
Are you sure your tests are using the user root? I think by default (I did not check recently) the user was ubuntu.
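To see which logging libraries actually end up on the test classpath, the dependency tree can be filtered for the usual suspects. This is only a sketch: `logging_deps` is a hypothetical helper name, and the pattern list is a guess at the relevant artifacts, not something taken from jepsen's project.clj.

```shell
# Keep only dependency-tree lines that mention a logging library or clj-ssh.
# Reads the output of `lein deps :tree` on stdin.
# The pattern list is an assumption about which artifacts matter here.
logging_deps() {
  grep -i -E 'slf4j|logback|log4j|clj-ssh|tools\.logging' || true
}
```

Usage, from the jepsen checkout: `lein with-profile +rabbitmq deps :tree | logging_deps`. The result should show which SLF4J backend, if any, is present alongside clj-ssh, which is a starting point for the substituteLogger warning.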
Thanks, I expected something like that with the loggers, and I would actually love to help fix that.
I was more interested in getting help on why my tests are failing to run, but once I'd typed the issue out, it seemed like, "well, without error messages from ssh, it's hard to say what's going wrong." :)
I do see that the default is still ubuntu. I changed it to root and still get an Auth fail.
Yeah, clojure logging has always been confusing for me. I think I might have fixed it once in a different project but forget how, haha.
I'm familiar with this sort of problem in Python logging, which is loosely modeled after log4j. I'm interested to know what Clojure adds to the picture, so maybe this is an expedition I can go on.
I remember reading in another issue that @aphyr had manually re-added 'ubuntu' users, so I did that and this seems to work. I'm confused that setting the user to root still didn't work, even though I can run, e.g., 'ssh root@n1'.
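An interactive 'ssh root@n1' can succeed through an agent, a key, or a typed password, so it does not say which method a non-interactive client would need. A rough way to check key-only login for each user from the control node is sketched below. `can_key_login` is a hypothetical helper, and the five second timeout is an arbitrary choice; the ssh options themselves are standard OpenSSH client options.

```shell
# Returns 0 if user $1 can log in to host $2 using public-key
# authentication only, with no interactive prompt; non-zero otherwise.
can_key_login() {
  ssh -o BatchMode=yes \
      -o PreferredAuthentications=publickey \
      -o ConnectTimeout=5 \
      "$1@$2" true
}
```

For example: `for u in root ubuntu; do can_key_login "$u" n1 && echo "$u: key login ok" || echo "$u: key login failed"; done`. Running the same ssh command by hand with `-v` added prints the authentication methods the server offers.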
Thanks to both of you for your feedback here. I'm not just asking for help for myself; I'm actually interested in making this easier to run, after spending some weeks trying to get it running myself. I really appreciate @abailly's vagrant env!
Hi Justin, Thanks. I developed it for the same reason you did: to understand jepsen better and to make it easier to run for quick (and not too dirty) tests. The next step would probably be dockerizing it... I plan to use jepsen to test the system I am currently developing and its underlying database, which is a bit exotic and not covered by the current test suite. This would further the goal advocated by @aphyr of making jepsen a reference for DB and distributed system vendors, something they would use to substantiate the kind of guarantees they can offer.
BTW, I am pretty familiar with logging in Java through SLF4J, which I think is what Clojure uses, and as I already said it is somewhat painful to set up, especially when dependencies are not cooperating... It plays a lot with classloading, and I have sometimes had to do some dark magic to get it working properly. If you need help with this, feel free to ping me...
Neat!
Maybe we need a #jepsen irc channel.
I'm also having these same errors. It doesn't help that the error messages from JSch are completely horrible. Modern versions of Debian no longer support RSA by default but use ECDSA, which isn't supported by JSch as far as I can tell. Is there any way to turn on more detailed error messages? I tried enabling tracing but nothing further came out.
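One way to test the key-type suspicion is to ask a node which host key types it actually offers. The sketch below assumes the standard `ssh-keyscan` output format of one "host keytype base64key" line per key; `host_key_types` is a hypothetical helper name. It says nothing about what JSch itself supports, only what the server presents.

```shell
# Reads ssh-keyscan output on stdin and prints each distinct key type once.
# Comment lines (starting with "#") and malformed lines are skipped.
host_key_types() {
  awk '!/^#/ && NF >= 3 { print $2 }' | sort -u
}
```

For example: `ssh-keyscan -t rsa,ecdsa n1 2>/dev/null | host_key_types`. If `ssh-rsa` is missing from the output, the node is not offering an RSA host key.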
Hi Justin/All, I did make changes in 'src/jepsen/control.clj'
I still get the following error when I try to run either the elasticsearch or the rabbitmq test profiles in lein:
vagrant@jepsen:/jepsen/jepsen$ !435
lein with-profile +rabbitmq test jepsen.system.rabbitmq-test
lein test jepsen.system.rabbitmq-test
SLF4J: The following loggers will not work becasue they were created
SLF4J: during the default configuration phase of the underlying logging system.
SLF4J: See also http://www.slf4j.org/codes.html#substituteLogger
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
SLF4J: clj-ssh.ssh
lein test :only jepsen.system.rabbitmq-test/rabbit-test
ERROR in (rabbit-test) (Session.java:512)
Uncaught exception, not in assertion.
expected: nil
  actual: com.jcraft.jsch.JSchException: Auth fail
 at com.jcraft.jsch.Session.connect (Session.java:512)
    com.jcraft.jsch.Session.connect (Session.java:183)
    clj_ssh.ssh$connect.invoke (ssh.clj:327)
    jepsen.control$session.invoke (control.clj:183)
    clojure.lang.AFn.applyToHelper (AFn.java:154)
    clojure.lang.AFn.applyTo (AFn.java:144)
    clojure.core$apply.invoke (core.clj:624)
    jepsen.core$fcatch$wrapper__4829.doInvoke (core.clj:39)
    clojure.lang.RestFn.invoke (RestFn.java:408)
    clojure.core$pmap$fn__6328$fn__6329.invoke (core.clj:6463)
    clojure.core$binding_conveyor_fn$fn__4145.invoke (core.clj:1910)
    clojure.lang.AFn.call (AFn.java:18)
    java.util.concurrent.FutureTask.run (FutureTask.java:266)
    java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1142)
    java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:617)
    java.lang.Thread.run (Thread.java:745)
Ran 1 tests containing 1 assertions.
0 failures, 1 errors.
Tests failed.
Error encountered performing task 'test' with profile(s): 'base,system,user,provided,dev,rabbitmq'
Tests failed.
I was hitting Auth fail too, and discovered that jsch is not using ssh keys but is trying the root user with a password. I've left notes in this issue describing what I observed and what I did to work around it.
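If jsch is attempting a root password login, the node's sshd has to permit that. The sketch below reads a setting out of an sshd_config-style file so that `PermitRootLogin` and `PasswordAuthentication` can be checked on a node. `sshd_setting` is a hypothetical helper, it ignores `Match` blocks, and what the vagrant box actually ships with is not something this thread establishes.

```shell
# Print the first value set for keyword $1 in the sshd_config-style file $2.
# sshd uses the first value it obtains for a keyword, and keywords are
# case-insensitive. Commented-out lines are skipped because their first
# field starts with "#". Simplification: Match blocks are not handled.
sshd_setting() {
  awk -v key="$1" 'tolower($1) == tolower(key) { print $2; exit }' "$2"
}
```

For example, on a node: `sshd_setting PermitRootLogin /etc/ssh/sshd_config`. A value such as `without-password` or `no` would mean root password logins are refused even though key-based 'ssh root@n1' works. Empty output means the keyword is not set in that file and sshd's built-in default applies.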
This week, against whatever the latest debian/jessie64 vagrant box is, I had no problems, except that I still wanted to fix the issue with logging in general.