$ /home/user/.cargo/bin/dust --version
Dust 0.7.0
$ /home/user/.cargo/bin/dust
thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow
Aborted (core dumped)
Ok, what were you running that on?
System: Ubuntu 18.04
Folder: ~/
Does it happen with old versions of dust?
If you 'cd' into one of the directories inside ~, does each of them work? If not, which one fails?
Does du work?
Do you have any suspicious directories that symlink to interesting places?
Approximately how big is '~'?
Hi! I got the exact same issue today. It reproduces somewhat randomly and could be related to having a huge number of path components.
I was trying to dust my working directory, as usual. But this time it contained 500 directories nested one inside another, which I had previously generated for tests (mkdir -p $(for i in {1..500}; do echo -n "qwe/"; done)).
OS: RHEL 8.4
17:23:11 ~$ ls repro
build qwe src toolchain
17:23:15 ~$ dust -rbd0 repro
thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow
Aborted (core dumped)
17:23:16 ~$ dust -rbd0 repro
thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow
Aborted (core dumped)
17:23:17 ~$ dust -rbd0 repro
thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow
Aborted (core dumped)
17:23:17 ~$ dust -rbd0 repro
thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow
Aborted (core dumped)
17:23:17 ~$ dust -rbd0 repro
9.3G └── repro
17:23:18 ~$ for i in $(seq 1 100); do dust -rbd0 repro/ ; done &>repro.log
17:23:44 ~$ grep -c 'stack overflow' repro.log
74
Backtrace (collected on 0.8.1):
#0 0x00007ffff708437f in raise () from /lib64/libc.so.6
#1 0x00007ffff706edb5 in abort () from /lib64/libc.so.6
#2 0x00005555556ab637 in std::sys::unix::abort_internal () at library/std/src/sys/unix/mod.rs:259
#3 0x00005555556aaca0 in std::sys::unix::stack_overflow::imp::signal_handler () at library/std/src/sys/unix/stack_overflow.rs:109
#4 <signal handler called>
#5 0x00005555555b6170 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#6 0x00005555555966d6 in std::panicking::try ()
#7 0x00005555555b40e7 in rayon_core::registry::in_worker ()
#8 0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#9 0x00005555555966d6 in std::panicking::try ()
#10 0x00005555555b40e7 in rayon_core::registry::in_worker ()
#11 0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#12 0x00005555555966d6 in std::panicking::try ()
#13 0x00005555555b40e7 in rayon_core::registry::in_worker ()
#14 0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#15 0x00005555555966d6 in std::panicking::try ()
#16 0x00005555555b40e7 in rayon_core::registry::in_worker ()
#17 0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#18 0x0000555555597abc in <rayon::iter::par_bridge::IterBridge<Iter> as rayon::iter::ParallelIterator>::drive_unindexed ()
#19 0x00005555555af377 in dust::dir_walker::walk::h17975b3bfa2d5ea2 ()
#20 0x00005555555afb29 in dust::dir_walker::walk::_$u7b$$u7b$closure$u7d$$u7d$::hb881a645907fa431 ()
#21 0x00005555555ad4f1 in <rayon::iter::filter_map::FilterMapFolder<C,P> as rayon::iter::plumbing::Folder<T>>::consume ()
#22 0x0000555555596ca9 in <rayon::iter::par_bridge::IterParallelProducer<Iter> as rayon::iter::plumbing::UnindexedProducer>::fold_with ()
#23 0x00005555555b628b in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#24 0x00005555555966d6 in std::panicking::try ()
#25 0x00005555555b40e7 in rayon_core::registry::in_worker ()
#26 0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#27 0x00005555555966d6 in std::panicking::try ()
#28 0x00005555555b40e7 in rayon_core::registry::in_worker ()
#29 0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#30 0x00005555555966d6 in std::panicking::try ()
#31 0x00005555555b40e7 in rayon_core::registry::in_worker ()
#32 0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#33 0x00005555555966d6 in std::panicking::try ()
#34 0x00005555555b40e7 in rayon_core::registry::in_worker ()
#35 0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#36 0x00005555555966d6 in std::panicking::try ()
#37 0x00005555555b40e7 in rayon_core::registry::in_worker ()
#38 0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#39 0x00005555555966d6 in std::panicking::try ()
#40 0x00005555555b40e7 in rayon_core::registry::in_worker ()
#41 0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#42 0x0000555555597abc in <rayon::iter::par_bridge::IterBridge<Iter> as rayon::iter::ParallelIterator>::drive_unindexed ()
#43 0x00005555555af377 in dust::dir_walker::walk::h17975b3bfa2d5ea2 ()
#44 0x00005555555afb29 in dust::dir_walker::walk::_$u7b$$u7b$closure$u7d$$u7d$::hb881a645907fa431 ()
#45 0x00005555555ad4f1 in <rayon::iter::filter_map::FilterMapFolder<C,P> as rayon::iter::plumbing::Folder<T>>::consume ()
#46 0x0000555555596ca9 in <rayon::iter::par_bridge::IterParallelProducer<Iter> as rayon::iter::plumbing::UnindexedProducer>::fold_with ()
#47 0x00005555555b628b in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
...
the same frames repeat like that for 5000+ frames of runaway recursion
...
#5863 0x00005555555966d6 in std::panicking::try ()
#5864 0x00005555555b40e7 in rayon_core::registry::in_worker ()
#5865 0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#5866 0x0000555555596860 in std::panicking::try ()
#5867 0x000055555559b940 in _$LT$rayon_core..job..StackJob$LT$L$C$F$C$R$GT$$u20$as$u20$rayon_core..job..Job$GT$::execute::h7ae67eafe3591cf4 ()
#5868 0x000055555558d02e in rayon_core::registry::WorkerThread::wait_until_cold ()
#5869 0x0000555555631daf in rayon_core::registry::ThreadBuilder::run ()
#5870 0x0000555555633d61 in std::sys_common::backtrace::__rust_begin_short_backtrace ()
#5871 0x00005555556367ab in core::ops::function::FnOnce::call_once{{vtable.shim}} ()
#5872 0x00005555556ab223 in alloc::boxed::{impl#44}::call_once<(), dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global> () at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/alloc/src/boxed.rs:1691
#5873 alloc::boxed::{impl#44}::call_once<(), alloc::boxed::Box<dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global>, alloc::alloc::Global> () at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/alloc/src/boxed.rs:1691
#5874 std::sys::unix::thread::{impl#2}::new::thread_start () at library/std/src/sys/unix/thread.rs:106
#5875 0x00007ffff79a015a in start_thread () from /lib64/libpthread.so.0
#5876 0x00007ffff7149dd3 in clone () from /lib64/libc.so.6
Does it happen with old versions of dust?
Reproduces with 0.6.0 and above. dust 0.5.4 works without any issues.
If you 'cd' into one of the directories inside ~, does each of them work? If not, which one fails?
Yup, all of them work individually (even the generated one), and the whole directory works if I remove the generated one:
17:26:14 ~$ for i in $(seq 1 100); do dust -rbd0 repro/build ; done &>repro.log
17:27:31 ~$ grep -c 'stack overflow' repro.log
0
17:27:38 ~$ for i in $(seq 1 100); do dust -rbd0 repro/src ; done &>repro.log
17:28:07 ~$ grep -c 'stack overflow' repro.log
0
17:28:11 ~$ for i in $(seq 1 100); do dust -rbd0 repro/qwe ; done &>repro.log
17:28:39 ~$ grep -c 'stack overflow' repro.log
0
17:28:54 ~$ for i in $(seq 1 100); do dust -rbd0 repro/toolchain ; done &>repro.log
17:29:02 ~$ grep -c 'stack overflow' repro.log
0
17:29:04 ~$ mv repro/qwe ~/
17:29:23 ~$ ls repro/
build src toolchain
17:29:26 ~$ for i in $(seq 1 100); do dust -rbd0 repro/ ; done &>repro.log
17:29:54 ~$ grep -c 'stack overflow' repro.log
0
Does du work?
It does, every time:
17:32:29 ~$ ls repro
build qwe src toolchain
17:32:45 ~$ for i in $(seq 1 100); do /bin/du -sh repro/ ; done &>repro.log
17:32:59 ~$ cat repro.log
9.4G repro/
9.4G repro/
9.4G repro/
9.4G repro/
...
Do you have any suspicious directories that symlink to interesting places?
Nothing suspicious outside of the one I described.
Approximately how big is '~'?
17:35:01 ~$ /bin/du -sh repro/*
1012M repro/build
0 repro/qwe
3.5G repro/src
181M repro/toolchain
17:35:07 ~$ find repro | wc -l
173841
17:35:30 ~$ find repro/build | wc -l
29547
17:35:36 ~$ find repro/qwe | wc -l
500
17:35:40 ~$ find repro/src | wc -l
120892
17:35:43 ~$ find repro/toolchain | wc -l
4546
17:35:49 ~$ find repro -type f | wc -l
150791
Thanks @Bobisonfire for a very detailed and interesting bug report.
Sadly I can't reproduce the failure on Ubuntu; however, the stack overflow is a good clue, so I'll experiment with increasing and decreasing the stack size.
Interesting that it worked with dust 0.5.4.
I've been able to reproduce it now. It's something to do with how Rayon handles its number of threads and/or stack size. I'm investigating.
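For reference, a minimal sketch of how a Rayon pool's worker stack size can be configured; the thread count and the 8 MiB figure here are illustrative assumptions, not dust's actual settings:

use rayon::ThreadPoolBuilder;

fn main() {
    // Build the global Rayon pool with an explicit per-worker stack size.
    // If this is never called, Rayon picks its own defaults.
    ThreadPoolBuilder::new()
        .num_threads(8)              // illustrative thread count (assumption)
        .stack_size(8 * 1024 * 1024) // 8 MiB per worker thread (assumption)
        .build_global()
        .expect("failed to build global Rayon pool");

    // ... the parallel directory walk would run on this pool ...
}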
FYI, I had the same issue in 0.8.1 (Arch package version) and got the following:
thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow
Aborted (core dumped)
However, this no longer occurs in the latest version of the master branch. Thank you for the fix.
Same issue on 0.8.6, macOS (M1); the folder is 73G:
thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow
So this can happen if the stack size is too low.
Conversely, if the stack size is too big, dust won't run on low-powered machines (e.g. small Linode servers).
I'm surprised it is happening on a Mac with a 73G folder.
If you are comfortable hacking on dust yourself, the stack allocation is here: https://github.com/bootandy/dust/blob/master/src/main.rs#L229 - if you increase it, I'd be curious to know whether that fixes your issue.
Perhaps I should let the user specify the stack size in a later version.
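For anyone experimenting, here is a minimal sketch of the general technique under discussion: running the work on a thread spawned with an explicit stack size via std::thread::Builder. The 1 GiB figure and the run_walk name are illustrative assumptions, not dust's actual code:

use std::thread;

// Hypothetical stand-in for the recursive directory walk.
fn run_walk() {
    // ... walk directories here ...
}

fn main() {
    // Spawn the walk on a thread with an explicit, larger stack instead of
    // relying on the default stack size, which varies by platform.
    let stack_size = 1024 * 1024 * 1024; // 1 GiB; illustrative, tune as needed
    thread::Builder::new()
        .stack_size(stack_size)
        .spawn(run_walk)
        .expect("failed to spawn worker thread")
        .join()
        .expect("worker thread panicked");
}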
Can you just get rid of recursion? There may be a significant speed increase.
Did you try a single thread with async fs?
Did you try a single thread with async fs?
I have not tried that, no. It would probably be a significant rewrite.
Can you just get rid of recursion? There may be a significant speed increase.
We've been through several iterations; this is as fast as we have been able to make it. Introducing Rayon gave large speed increases.
File walking tends to be recursive by nature, especially if you are running with lots of threads. I did have a version that assigned a thread to each subdirectory and ran without recursion, but I couldn't allocate the threads well and the performance was poor.
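For context, a minimal sketch of what a non-recursive walk looks like: an explicit heap-allocated work queue instead of call-stack recursion. It is single-threaded for clarity and illustrates the technique only; it is not dust's implementation:

use std::collections::VecDeque;
use std::fs;
use std::path::PathBuf;

// Sum apparent file sizes under `root` without recursion: directories
// waiting to be scanned live in a heap-allocated queue, so nesting depth
// can never overflow the call stack.
fn walk_size(root: PathBuf) -> u64 {
    let mut total = 0u64;
    let mut queue = VecDeque::from([root]);
    while let Some(dir) = queue.pop_front() {
        let entries = match fs::read_dir(&dir) {
            Ok(e) => e,
            Err(_) => continue, // skip unreadable directories
        };
        for entry in entries.flatten() {
            match entry.metadata() {
                Ok(meta) if meta.is_dir() => queue.push_back(entry.path()),
                Ok(meta) => total += meta.len(),
                Err(_) => {} // skip entries we cannot stat
            }
        }
    }
    total
}

fn main() {
    println!("{}", walk_size(PathBuf::from(".")));
}

The trade-off mirrors the discussion above: the queue moves depth from the call stack to the heap, but parallelising such a walk efficiently is where it gets hard.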
I'm afraid I have this issue. I'm on an M1 MacBook Pro with 16GB and I have the latest version of Dust installed :-)
I'm bringing back the '--stack-size' parameter and providing better defaults in the next version.
@matthewblott - I have pushed a new version, dust 0.9.0. Does it still have this issue? If yes, can you try running dust with different stack sizes, like this:
dust -S 1048576
dust -S 1073741824
Increase the stack size if the error is still seen - thanks.
@bootandy That works a treat, thanks :-)