nrepl icon indicating copy to clipboard operation
nrepl copied to clipboard

1 minute delay on exit

Open metametadata opened this issue 3 years ago • 9 comments

Expected behavior

There's no delay on exit.

Actual behavior

App waits for 1 minute before exiting.

Steps to reproduce the problem

Require nrepl.server ns and exit the app. Example for version 0.8.0:

$ time clj -Sdeps '{:deps {nrepl {:mvn/version "0.8.0"}}}' -e "(do (require 'nrepl.server) (prn :hi))"
:hi
clj -Sdeps '{:deps {nrepl {:mvn/version "0.8.0"}}}' -e   5.89s user 0.39s system 10% cpu 1:01.57 total

Other

The issue is not reproducible in 0.7.0:

$ time clj -Sdeps '{:deps {nrepl {:mvn/version "0.7.0"}}}' -e "(do (require 'nrepl.server) (prn :hi))"
:hi
clj -Sdeps '{:deps {nrepl {:mvn/version "0.7.0"}}}' -e   5.06s user 0.32s system 344% cpu 1.563 total

Environment & Version information

Clojure version

1.10.1

Java version

11.0.9

Operating system

macOS

metametadata avatar Dec 15 '20 17:12 metametadata

Is this still slow with the latest beta?

bbatsov avatar Jul 29 '21 05:07 bbatsov

Yes.

$ time clj -Sdeps '{:deps {nrepl {:mvn/version "0.9.0-beta1"}}}' -e "(do (require 'nrepl.server) (prn :hi))"
:hi
clj -Sdeps '{:deps {nrepl {:mvn/version "0.9.0-beta1"}}}' -e   5.38s user 0.40s system 9% cpu 1:01.74 total

metametadata avatar Jul 29 '21 10:07 metametadata

Hmm, I can't put my finger on which particular change in 0.8 caused this regression. We might have to bisect the code to figure this out, or profile that require, as I assume it's triggering something that takes a while now.

@shen-tian any ideas what might be causing this?

bbatsov avatar Aug 18 '21 08:08 bbatsov

Sorry I've been awol. I just had a look through changelog, and nothing jumps out at me. The exactly 1 minute part is also curious. The only thing close I can think of is https://github.com/nrepl/nrepl/pull/163 <- but that's before 0.8.

shen-tian avatar Aug 20 '21 08:08 shen-tian

1 minute delay looks like a problem related to agent pool. Maybe future, pmap or agent-send were added after v0.7.0. See https://clojure.atlassian.net/browse/CLJ-124.

If it's true and nREPL cannot be fixed then all the consumers may need to add shutdown-agents calls as recommended in https://clojure.org/guides/faq#agent_shutdown:

When a program like this finishes, there will be a 60 second pause before exit. To fix this problem, call shutdown-agents as the program exits.

The discussion about a similar problem in clojure.java.shell/sh: https://clojureverse.org/t/why-doesnt-my-program-exit/3754.

metametadata avatar Aug 20 '21 11:08 metametadata

I'm not sure the repro makes a lot of sense on its own. Namely, for

$ time clj -Sdeps '{:deps {nrepl {:mvn/version "0.9.0-beta1"}}}' -e "(do (require 'nrepl.server) (prn :hi))"

It is perfectly natural that it takes 60s. It's not nrepl's job to call shutdown-agents? It seems a decision better suited for apps, nREPL clients or end users. How could nREPL know when the JVM should stop?

The same can be said of arbitrary libs that use future.

I'd suggest that a custom Ctrl-C handler makes the shutdown-agents itself if deemed adequate.

Otherwise, most likely a repro would have to reflect a realistic use case.

If perhaps your app lacked a (shutdown-agents), maybe the 'good' pre 0.8 behavior was more of a coincidence, rather than a contractual behavior.

vemv avatar Oct 29 '21 07:10 vemv

Perhaps we should document this somewhere (e.g. in a Troubleshooting section)?

bbatsov avatar Oct 29 '21 07:10 bbatsov

Considering how often people seem to forget about this pitfall: it would be nicer to fix the lib than consumers. But it's up to you.

metametadata avatar Oct 29 '21 09:10 metametadata

The fix seems plausibly dangerous, and importantly the issue is 'present' in arbitrarily many libs (i.e. any lib using future somewhere)

Perhaps we should document this somewhere (e.g. in a Troubleshooting section)?

(sgtm)

vemv avatar Oct 29 '21 09:10 vemv