unison is ungraceful when running out of memory

Open gdt opened this issue 4 years ago • 8 comments

This is sort of a meta ticket: When unison runs out of memory, the user gets an uncaught exception and backtrace. This leads to many tickets. It would be nice to somehow catch these and give a nicer error message. It's hard to say what should happen but I basically view it as a bug to expose a stack trace in the implementation language.

This certainly applies to regular dynamic allocation, and perhaps also to stack space.

Ideally the user would understand what led to memory exhaustion, perhaps with a printout of the number of files being considered.

Please do not add comments about specific instances of this problem. If you have a situation where unison should not run out of memory but does anyway, please open an issue if you have analyzed the relationship between memory use and the number of items.
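For reports of this kind, the number of items under a root can be gathered with standard tools. The following is only a minimal sketch: the function name count_items is made up for illustration, and it assumes no file names contain newlines (those would be miscounted).

```shell
#!/bin/sh
# Sketch: print how many regular files and directories live under a root.
# count_items is a hypothetical helper name, not part of unison.
count_items() {
    root=$1
    if [ ! -d "$root" ]; then
        echo "not a directory: $root" >&2
        return 1
    fi
    # wc -l pads its output on some systems, so strip the spaces
    files=$(find "$root" -type f | wc -l | tr -d ' ')
    dirs=$(find "$root" -type d | wc -l | tr -d ' ')
    # find lists the root itself as a directory, so subtract one
    echo "$files files, $((dirs - 1)) directories"
}
```

Calling count_items /opt would print a line in the same shape as the counts quoted in the comments on this issue.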

gdt avatar Sep 15 '20 13:09 gdt

I am using the Windows Binary Unison 2.48.3 (compiled with OCaml 4.01.0, incompatible with OCaml 4.02 builds)

My roots are bigger than huge, 633,084 Files, 2,257 Folders.

I got it to work by doing each folder individually using forfiles to launch unison separately for each directory.
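On a Unix-like system the same per-directory approach could be sketched roughly as below. This is not the forfiles command I used. The function name sync_each_subdir and the SYNC_CMD override are made up for illustration, and any options (such as -batch) are passed through unchanged, so which ones to use is your own choice.

```shell
#!/bin/sh
# Sketch: run one unison invocation per top-level subdirectory so that
# each run only has to consider a fraction of the files.
# sync_each_subdir and SYNC_CMD are hypothetical names for illustration.
sync_each_subdir() {
    root_a=$1
    root_b=$2
    shift 2                      # remaining arguments are passed to unison
    sync_cmd=${SYNC_CMD:-unison}
    for dir in "$root_a"/*/; do
        [ -d "$dir" ] || continue            # glob matched nothing
        name=$(basename "$dir")
        # Each subdirectory pair is treated as its own pair of roots.
        "$sync_cmd" "$root_a/$name" "$root_b/$name" "$@" \
            || echo "sync failed for $name" >&2
    done
}
```

For example, sync_each_subdir /data /backup -batch would run unison once for every directory directly under /data. Files that sit directly in the top-level root are not covered by this loop and would need a separate run.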

powertoaster avatar Sep 15 '20 16:09 powertoaster

Today I ran into a (probably) related issue, so I am adding the details here:

(a) Input is a large /opt hierarchy with 549656 files and 37384 directories (most of these in four full installations of “TeX Live” 2018–2021) and this profile.

(b) Running unison[-gtk] 2.48.4 on Kubuntu 20.04 LTS results in an “Uncaught exception: Stack overflow” message for the invocation unison -times opt.prf. [Screenshot_20210706_152940]

(c) Running the same unison 2.48.4 without the -times option succeeds.

(d) Running unison-gtk2 2.51.4 (ocaml 4.12.0) with the -times option seems to run through, but at the end unison freezes [Screenshot_20210706_153209], and when I try to exit the GUI I get a system message. [Screenshot_20210706_152631]

(e) After installing unison 2.51.4 manually in /usr/local/bin, the original unison 2.48.4 picks up the FS monitor and reports an error. [Screenshot_20210706_153132]

ascherer avatar Jul 06 '21 13:07 ascherer

Thanks. It would be interesting to see what happens with 2.51.4 with 2.48 completely purged from your system. I'm not claiming or even suggesting this is fixed in 2.51.4, but I am not aware of anyone working on unison being willing to think about 2.48 at all. (The last error looks like it could be a mixed version problem.)

gdt avatar Jul 06 '21 16:07 gdt

(d) Running unison-gtk2 2.51.4 (ocaml 4.12.0) with the -times option seems to run through, but at the end unison freezes

I have two follow-up questions on this one.

First, would you be able to get a backtrace of the frozen process? For example, look up the PID of the process (with ps, or try pidof unison-gtk2) and then execute something like gdb -batch -ex bt -p <the PID> if you have gdb installed.
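The two steps can be combined as in the sketch below. The function name backtrace_of is made up for illustration; the pidof and gdb invocations are the ones mentioned above. On systems that restrict ptrace (Ubuntu does by default), attaching to the process may require running gdb as root.

```shell
#!/bin/sh
# Sketch: print a backtrace of a running process, looked up by name.
# backtrace_of is a hypothetical helper name.
backtrace_of() {
    name=$1
    # pidof may print several PIDs; take the first one
    pid=$(pidof "$name" | awk '{print $1}')
    if [ -z "$pid" ]; then
        echo "no running process named $name" >&2
        return 1
    fi
    # -batch makes gdb exit after running the command given with -ex
    gdb -batch -ex bt -p "$pid"
}
```

For the case here that would be backtrace_of unison-gtk2, run while the GUI appears frozen.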

This is getting a bit off-topic for this ticket, but it is fine with me if you post the trace in the comments here.

Second, does the freeze happen also without the -times option?

tleedjarv avatar Jul 06 '21 17:07 tleedjarv

As my macOS box has the latest unison 2.51.4 (ocaml 4.12.0) as well (via brew install), I purged the two Kubuntu packages unison and unison-gtk and installed the three programs from the recent assets in /usr/local/bin.

This results in point (d) above.

Here's the backtrace as requested:

[New LWP 2362]
[New LWP 2364]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f1bea1429e7 in g_list_nth () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#0  0x00007f1bea1429e7 in g_list_nth () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#1  0x00007f1bea641def in gtk_clist_set_text () from /lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0
#2  0x000055f6def90c65 in ml_gtk_clist_set_text ()
#3  0x000055f6deea5d7f in camlGtkList__set_cell_inner_1485 ()
#4  0x000055f6dedf8efa in camlUigtk2__displayMain_4765 ()
#5  0x000055f6dedfa14f in camlUigtk2__detectUpdatesAndReconcile_4868 ()
#6  0x000055f6dedcec11 in camlUigtk2__getLock_925 ()
#7  0x000055f6dedfcfb1 in camlUigtk2__detectCmd_5101 ()
#8  0x000055f6dedff47c in camlUigtk2__start_5250 ()
#9  0x000055f6dedff669 in camlUigtk2__start_5267 ()
#10 0x000055f6dee014d1 in camlMain__Body_417 ()
#11 0x000055f6dedc9773 in camlLinkgtk2__entry ()
#12 0x000055f6dedc2109 in caml_program ()
#13 0x000055f6defc5a54 in caml_start_program ()
#14 0x000055f6defc5dc4 in caml_startup_common ()
#15 0x000055f6defc5e0b in caml_startup ()
#16 0x000055f6dedc143c in main ()
[Inferior 1 (process 2356) detached]

ascherer avatar Jul 06 '21 17:07 ascherer

Amendment: While I was collecting the backtrace, unison finally returned! It (correctly) lists all file timestamps as different between my actual hard drive and the backup medium. (I never used -times before.)

ascherer avatar Jul 06 '21 17:07 ascherer

Thank you for the information. The backtrace and the fact that you have never used -times before confirm that the GUI is just very, very slow with such a huge number of updates.

There is actually already a PR for fixing this GUI slowness: #557 (if you'd like to try it, you can also download a binary from here https://github.com/bcpierce00/unison/actions/runs/970981503).

tleedjarv avatar Jul 06 '21 18:07 tleedjarv