Aleph/manifold thread occasionally hangs when dereferencing http requests
We're are using Aleph 0.4.6 and we ran several times into the issue where threads get stuck when dereferencing http requests.
Today, it occurred in these 2 functions:
(defn intervention-details-select [{:keys [leo-url]} user intervention-id]
(->>
(-> @(http/get (str leo-url "/sqa/interventions/"
intervention-id "/formValue")
{:query-params
{:form "INTERVENTIONS/Intervention_Details"
:user user
:mode "list"}})
:body
bs/to-string
(json/decode keyword)
(try (catch com.fasterxml.jackson.core.JsonParseException e
(throw
(ex-info "Resource not found"
{::anomalies/category ::anomalies/not-found
:error/parameters {:parameter :object}}))))
(group-intervention-details intervention-id))
(mapcat tabelize-intervention-detail-item)))
(defn leo-xml-call
[xml-body xml-namespaces & [{:keys [leo-url api-extra-attrs] :as options}]]
(let [req (xml-emit-string (leo-xml-request xml-body xml-namespaces options))
resp (try (-> @(http/post (str leo-url "/request")
{:headers {"Content-Type" "text/plain; charset=utf-8"}
:query-params {:RC "csIntrfc"
:AD "KEY:/cf"}
:body req}))
(catch Exception e e))
resp-string (when (:body resp) (bs/to-string (:body resp)))
resp-body (when resp-string (xml-parse resp-string))
resp-body))
And here are 2 consecutive thread dumps, at about 10 minutes interval, that show threads stuck while dereferencing these queries.
First thread dump:
--
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: "manifold-pool-2-556" #1026 daemon prio=5 os_prio=0 tid=0x00007f58b4002000 nid=0x157 waiting on condition [0x00007f5893cb3000]
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: java.lang.Thread.State: WAITING (parking)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at sun.misc.Unsafe.park(Native Method)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: - parking to wait for <0x00000000e19330d0> (a java.util.concurrent.CountDownLatch$Sync)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at manifold.deferred.Deferred.deref(deferred.clj:441)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at clojure.core$deref.invokeStatic(core.clj:2312)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at clojure.core$deref.invoke(core.clj:2298)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at sqa.leo$intervention_details_select$fn__27334.invoke(leo.clj:241)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at sqa.leo$intervention_details_select.invokeStatic(leo.clj:242)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at sqa.leo$intervention_details_select.invoke(leo.clj:231)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at sqa.bl$intervention_details_select.invokeStatic(bl.clj:668)
--
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: "manifold-pool-2-178" #367 daemon prio=5 os_prio=0 tid=0x00007f58dc001000 nid=0x6b5c waiting on condition [0x00007f5894cc1000]
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: java.lang.Thread.State: WAITING (parking)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at sun.misc.Unsafe.park(Native Method)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: - parking to wait for <0x00000000a9f28ae8> (a java.util.concurrent.CountDownLatch$Sync)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at manifold.deferred.Deferred.deref(deferred.clj:441)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at clojure.core$deref.invokeStatic(core.clj:2312)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at clojure.core$deref.invoke(core.clj:2298)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at sqa.leo$leo_xml_call$fn__27373.invoke(leo.clj:347)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at sqa.leo$leo_xml_call.invokeStatic(leo.clj:347)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at sqa.leo$leo_xml_call.doInvoke(leo.clj:338)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at clojure.lang.RestFn.invoke(RestFn.java:442)
--
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: "manifold-pool-2-177" #364 daemon prio=5 os_prio=0 tid=0x00007f58d8001800 nid=0x6b56 waiting on condition [0x00007f58952c5000]
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: java.lang.Thread.State: WAITING (parking)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at sun.misc.Unsafe.park(Native Method)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: - parking to wait for <0x00000000a9f2bf48> (a java.util.concurrent.CountDownLatch$Sync)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at manifold.deferred.Deferred.deref(deferred.clj:441)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at clojure.core$deref.invokeStatic(core.clj:2312)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at clojure.core$deref.invoke(core.clj:2298)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at sqa.leo$intervention_details_select$fn__27334.invoke(leo.clj:241)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at sqa.leo$intervention_details_select.invokeStatic(leo.clj:242)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at sqa.leo$intervention_details_select.invoke(leo.clj:231)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at sqa.leo$intervention_details_set.invokeStatic(leo.clj:252)
--
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: "manifold-pool-2-176" #363 daemon prio=5 os_prio=0 tid=0x00007f58c0003800 nid=0x6b53 waiting on condition [0x00007f5894fc2000]
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: java.lang.Thread.State: WAITING (parking)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at sun.misc.Unsafe.park(Native Method)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: - parking to wait for <0x00000000a9f2f598> (a java.util.concurrent.CountDownLatch$Sync)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at manifold.deferred.Deferred.deref(deferred.clj:441)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at clojure.core$deref.invokeStatic(core.clj:2312)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at clojure.core$deref.invoke(core.clj:2298)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at sqa.leo$leo_xml_call$fn__27373.invoke(leo.clj:347)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at sqa.leo$leo_xml_call.invokeStatic(leo.clj:347)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at sqa.leo$leo_xml_call.doInvoke(leo.clj:338)
Jan 13 15:56:54 sp-sqa01 SQA_alpha[24105]: at clojure.lang.RestFn.invoke(RestFn.java:442)
Second thread dump, about 10 minutes later:
--
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: "manifold-pool-2-556" #1026 daemon prio=5 os_prio=0 tid=0x00007f58b4002000 nid=0x157 waiting on condition [0x00007f5893cb3000]
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: java.lang.Thread.State: WAITING (parking)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at sun.misc.Unsafe.park(Native Method)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: - parking to wait for <0x00000000e6f1b7c0> (a java.util.concurrent.CountDownLatch$Sync)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at manifold.deferred.Deferred.deref(deferred.clj:441)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at clojure.core$deref.invokeStatic(core.clj:2312)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at clojure.core$deref.invoke(core.clj:2298)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at sqa.leo$intervention_details_select$fn__27334.invoke(leo.clj:241)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at sqa.leo$intervention_details_select.invokeStatic(leo.clj:242)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at sqa.leo$intervention_details_select.invoke(leo.clj:231)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at sqa.bl$intervention_details_select.invokeStatic(bl.clj:668)
--
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: "manifold-pool-2-178" #367 daemon prio=5 os_prio=0 tid=0x00007f58dc001000 nid=0x6b5c waiting on condition [0x00007f5894cc1000]
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: java.lang.Thread.State: WAITING (parking)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at sun.misc.Unsafe.park(Native Method)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: - parking to wait for <0x00000000a9f28ae8> (a java.util.concurrent.CountDownLatch$Sync)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at manifold.deferred.Deferred.deref(deferred.clj:441)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at clojure.core$deref.invokeStatic(core.clj:2312)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at clojure.core$deref.invoke(core.clj:2298)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at sqa.leo$leo_xml_call$fn__27373.invoke(leo.clj:347)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at sqa.leo$leo_xml_call.invokeStatic(leo.clj:347)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at sqa.leo$leo_xml_call.doInvoke(leo.clj:338)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at clojure.lang.RestFn.invoke(RestFn.java:442)
--
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: "manifold-pool-2-177" #364 daemon prio=5 os_prio=0 tid=0x00007f58d8001800 nid=0x6b56 waiting on condition [0x00007f58952c5000]
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: java.lang.Thread.State: WAITING (parking)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at sun.misc.Unsafe.park(Native Method)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: - parking to wait for <0x00000000a9f2bf48> (a java.util.concurrent.CountDownLatch$Sync)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at manifold.deferred.Deferred.deref(deferred.clj:441)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at clojure.core$deref.invokeStatic(core.clj:2312)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at clojure.core$deref.invoke(core.clj:2298)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at sqa.leo$intervention_details_select$fn__27334.invoke(leo.clj:241)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at sqa.leo$intervention_details_select.invokeStatic(leo.clj:242)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at sqa.leo$intervention_details_select.invoke(leo.clj:231)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at sqa.leo$intervention_details_set.invokeStatic(leo.clj:252)
--
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: "manifold-pool-2-176" #363 daemon prio=5 os_prio=0 tid=0x00007f58c0003800 nid=0x6b53 waiting on condition [0x00007f5894fc2000]
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: java.lang.Thread.State: WAITING (parking)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at sun.misc.Unsafe.park(Native Method)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: - parking to wait for <0x00000000a9f2f598> (a java.util.concurrent.CountDownLatch$Sync)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at manifold.deferred.Deferred.deref(deferred.clj:441)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at clojure.core$deref.invokeStatic(core.clj:2312)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at clojure.core$deref.invoke(core.clj:2298)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at sqa.leo$leo_xml_call$fn__27373.invoke(leo.clj:347)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at sqa.leo$leo_xml_call.invokeStatic(leo.clj:347)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at sqa.leo$leo_xml_call.doInvoke(leo.clj:338)
Jan 13 16:08:36 sp-sqa01 SQA_alpha[24105]: at clojure.lang.RestFn.invoke(RestFn.java:442)
Any suggestion on how to troubleshoot this issue is welcome.
Thanks a lot in advance!
While digging more into the logs today, I found we had a couple of FileNotFoundExceptions within a manifold.deferred.Listener.onSuccess callback, which occurred before we noticed some calls where hanging.
Could that have prevented some resources from being properly released?
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: java.io.FileNotFoundException: ... (Is a directory)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at java.io.FileInputStream.open0(Native Method)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at java.io.FileInputStream.open(FileInputStream.java:195)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at java.io.FileInputStream.<init>(FileInputStream.java:138)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at java.io.FileInputStream.<init>(FileInputStream.java:93)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at java.io.FileReader.<init>(FileReader.java:58)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at sqa.render$get_template.invokeStatic(render.clj:62)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at sqa.render$get_template.invoke(render.clj:54)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at sqa.api$fn__35673.invokeStatic(api.clj:1357)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at sqa.api$fn__35673.invoke(api.clj:1354)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at sqa.api$wrap_error_handler$fn__35690.invoke(api.clj:1431)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at yada.methods$apply_response_fn.invokeStatic(methods.clj:31)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at yada.methods$apply_response_fn.invoke(methods.clj:30)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at yada.methods$fn__28312$fn__28317.invoke(methods.clj:174)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at manifold.deferred$fn__10239$chain___10260.invoke(deferred.clj:859)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at clojure.lang.AFn.applyToHelper(AFn.java:165)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at clojure.lang.RestFn.applyTo(RestFn.java:132)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at clojure.core$apply.invokeStatic(core.clj:661)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at clojure.core$apply.invoke(core.clj:652)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at manifold.deferred$fn__10239$chain___10260$fn__10264.invoke(deferred.clj:888)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at manifold.deferred.Listener.onSuccess(deferred.clj:219)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at manifold.deferred.Deferred$fn__10046.invoke(deferred.clj:381)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at clojure.lang.AFn.run(AFn.java:22)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at io.aleph.dirigiste.Executor$Worker$1.run(Executor.java:62)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at manifold.executor$thread_factory$reify__9563$f__9564.invoke(executor.clj:44)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at clojure.lang.AFn.run(AFn.java:22)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at java.lang.Thread.run(Thread.java:748)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: 15:41:25.824 [manifold-pool-2-501] ERROR yada.handler - Internal Error 500
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: clojure.lang.ExceptionInfo: ... (Is a directory)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at clojure.core$ex_info.invokeStatic(core.clj:4739)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at clojure.core$ex_info.invoke(core.clj:4739)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at sqa.api$bl_error_handler.invokeStatic(api.clj:112)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at sqa.api$bl_error_handler.invoke(api.clj:97)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at sqa.api$wrap_error_handler$fn__35690.invoke(api.clj:1432)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at yada.methods$apply_response_fn.invokeStatic(methods.clj:31)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at yada.methods$apply_response_fn.invoke(methods.clj:30)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at yada.methods$fn__28312$fn__28317.invoke(methods.clj:174)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at manifold.deferred$fn__10239$chain___10260.invoke(deferred.clj:859)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at clojure.lang.AFn.applyToHelper(AFn.java:165)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at clojure.lang.RestFn.applyTo(RestFn.java:132)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at clojure.core$apply.invokeStatic(core.clj:661)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at clojure.core$apply.invoke(core.clj:652)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at manifold.deferred$fn__10239$chain___10260$fn__10264.invoke(deferred.clj:888)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at manifold.deferred.Listener.onSuccess(deferred.clj:219)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at manifold.deferred.Deferred$fn__10046.invoke(deferred.clj:381)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at clojure.lang.AFn.run(AFn.java:22)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at io.aleph.dirigiste.Executor$Worker$1.run(Executor.java:62)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at manifold.executor$thread_factory$reify__9563$f__9564.invoke(executor.clj:44)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at clojure.lang.AFn.run(AFn.java:22)
Jan 13 15:41:25 sp-sqa01 SQA_alpha[24105]: at java.lang.Thread.run(Thread.java:748)
We downgraded back to 0.4.4 11 days ago there are no more threads hanging, which suggests a regression on 0.4.6.
Let me know when a new stable release is available and I'll give it a test run.
There it hanged again today with 0.4.4. We'll replace the http client side that uses Aleph with another lib.
Have the same issue. Timeouts don't help it either.
We'll replace the http client side that uses Aleph with another lib.
Which lib did you choose in the end?
We use clj-http. It was a drop-in replacement and we had no more issues since then.
Reading through the old logs here, it appears that the FileNotFoundException may have been killing it. The stacktraces include the listener calls, which would not happen if an executor was set, so they're running on the same thread.
In that case, the issue becomes the invisibility of exceptions. We've done some work over the last year in improving this (#592), so invisibility might not be an issue now.
I'm tempted to close this unless anyone wants to keep it open for some reason.