Aborted (core dumped)

Open taqtiqa-mark opened this issue 4 years ago • 16 comments

$ /home/user/.cargo/bin/dust --version
Dust 0.7.0
$ /home/user/.cargo/bin/dust                                   
                                                                               
thread '<unknown>' has overflowed its stack                         
fatal runtime error: stack overflow                                        
Aborted (core dumped)

taqtiqa-mark avatar Nov 16 '21 08:11 taqtiqa-mark

OK, what were you running that on?

bootandy avatar Nov 20 '21 12:11 bootandy

System: Ubuntu 18.04 Folder: ~/

taqtiqa-mark avatar Nov 20 '21 21:11 taqtiqa-mark

Does it happen with old versions of dust?

if you 'cd' into one of the directories inside ~ do each of them work - if not which one fails?

Does du work ?

Do you have any suspicious directories that symlink to interesting places ?

Approx. how big is '~'?

bootandy avatar Nov 28 '21 10:11 bootandy

Hi! I got the exact same issue today. It reproduces somewhat randomly and could be related to having a huge number of path components.

I was trying to dust my working directory, as usual. But this time it contained 500 directories nested one inside another, which I had previously generated for tests (mkdir -p $(for i in {1..500}; do echo -n "qwe/"; done))

OS: RHEL 8.4

17:23:11 ~$ ls repro
build  qwe  src  toolchain

17:23:15 ~$ dust -rbd0 repro

thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow
Aborted (core dumped)
17:23:16 ~$ dust -rbd0 repro

thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow
Aborted (core dumped)
17:23:17 ~$ dust -rbd0 repro

thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow
Aborted (core dumped)
17:23:17 ~$ dust -rbd0 repro

thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow
Aborted (core dumped)
17:23:17 ~$ dust -rbd0 repro
 9.3G └── repro

17:23:18 ~$ for i in $(seq 1 100); do dust -rbd0 repro/ ; done &>repro.log
17:23:44 ~$ grep -c 'stack overflow' repro.log
74

Backtrace (collected on 0.8.1):

#0  0x00007ffff708437f in raise () from /lib64/libc.so.6
#1  0x00007ffff706edb5 in abort () from /lib64/libc.so.6
#2  0x00005555556ab637 in std::sys::unix::abort_internal () at library/std/src/sys/unix/mod.rs:259
#3  0x00005555556aaca0 in std::sys::unix::stack_overflow::imp::signal_handler () at library/std/src/sys/unix/stack_overflow.rs:109
#4  <signal handler called>
#5  0x00005555555b6170 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#6  0x00005555555966d6 in std::panicking::try ()
#7  0x00005555555b40e7 in rayon_core::registry::in_worker ()
#8  0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#9  0x00005555555966d6 in std::panicking::try ()
#10 0x00005555555b40e7 in rayon_core::registry::in_worker ()
#11 0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#12 0x00005555555966d6 in std::panicking::try ()
#13 0x00005555555b40e7 in rayon_core::registry::in_worker ()
#14 0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#15 0x00005555555966d6 in std::panicking::try ()
#16 0x00005555555b40e7 in rayon_core::registry::in_worker ()
#17 0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#18 0x0000555555597abc in <rayon::iter::par_bridge::IterBridge<Iter> as rayon::iter::ParallelIterator>::drive_unindexed ()
#19 0x00005555555af377 in dust::dir_walker::walk::h17975b3bfa2d5ea2 ()
#20 0x00005555555afb29 in dust::dir_walker::walk::_$u7b$$u7b$closure$u7d$$u7d$::hb881a645907fa431 ()
#21 0x00005555555ad4f1 in <rayon::iter::filter_map::FilterMapFolder<C,P> as rayon::iter::plumbing::Folder<T>>::consume ()
#22 0x0000555555596ca9 in <rayon::iter::par_bridge::IterParallelProducer<Iter> as rayon::iter::plumbing::UnindexedProducer>::fold_with ()
#23 0x00005555555b628b in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#24 0x00005555555966d6 in std::panicking::try ()
#25 0x00005555555b40e7 in rayon_core::registry::in_worker ()
#26 0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#27 0x00005555555966d6 in std::panicking::try ()
#28 0x00005555555b40e7 in rayon_core::registry::in_worker ()
#29 0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#30 0x00005555555966d6 in std::panicking::try ()
#31 0x00005555555b40e7 in rayon_core::registry::in_worker ()
#32 0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#33 0x00005555555966d6 in std::panicking::try ()
#34 0x00005555555b40e7 in rayon_core::registry::in_worker ()
#35 0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#36 0x00005555555966d6 in std::panicking::try ()
#37 0x00005555555b40e7 in rayon_core::registry::in_worker ()
#38 0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#39 0x00005555555966d6 in std::panicking::try ()
#40 0x00005555555b40e7 in rayon_core::registry::in_worker ()
#41 0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#42 0x0000555555597abc in <rayon::iter::par_bridge::IterBridge<Iter> as rayon::iter::ParallelIterator>::drive_unindexed ()
#43 0x00005555555af377 in dust::dir_walker::walk::h17975b3bfa2d5ea2 ()
#44 0x00005555555afb29 in dust::dir_walker::walk::_$u7b$$u7b$closure$u7d$$u7d$::hb881a645907fa431 ()
#45 0x00005555555ad4f1 in <rayon::iter::filter_map::FilterMapFolder<C,P> as rayon::iter::plumbing::Folder<T>>::consume ()
#46 0x0000555555596ca9 in <rayon::iter::par_bridge::IterParallelProducer<Iter> as rayon::iter::plumbing::UnindexedProducer>::fold_with ()
#47 0x00005555555b628b in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
...
repeats like that for 5000+ frames, mad recursion
...
#5863 0x00005555555966d6 in std::panicking::try ()
#5864 0x00005555555b40e7 in rayon_core::registry::in_worker ()
#5865 0x00005555555b6175 in rayon::iter::plumbing::bridge_unindexed_producer_consumer ()
#5866 0x0000555555596860 in std::panicking::try ()
#5867 0x000055555559b940 in _$LT$rayon_core..job..StackJob$LT$L$C$F$C$R$GT$$u20$as$u20$rayon_core..job..Job$GT$::execute::h7ae67eafe3591cf4 ()
#5868 0x000055555558d02e in rayon_core::registry::WorkerThread::wait_until_cold ()
#5869 0x0000555555631daf in rayon_core::registry::ThreadBuilder::run ()
#5870 0x0000555555633d61 in std::sys_common::backtrace::__rust_begin_short_backtrace ()
#5871 0x00005555556367ab in core::ops::function::FnOnce::call_once{{vtable.shim}} ()
#5872 0x00005555556ab223 in alloc::boxed::{impl#44}::call_once<(), dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global> () at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/alloc/src/boxed.rs:1691
#5873 alloc::boxed::{impl#44}::call_once<(), alloc::boxed::Box<dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global>, alloc::alloc::Global> () at /rustc/f1edd0429582dd29cccacaf50fd134b05593bd9c/library/alloc/src/boxed.rs:1691
#5874 std::sys::unix::thread::{impl#2}::new::thread_start () at library/std/src/sys/unix/thread.rs:106
#5875 0x00007ffff79a015a in start_thread () from /lib64/libpthread.so.0
#5876 0x00007ffff7149dd3 in clone () from /lib64/libc.so.6

Does it happen with old versions of dust?

Reproduces with 0.6.0 and above. dust 0.5.4 works without any issues.

if you 'cd' into one of the directories inside ~ do each of them work - if not which one fails?

Yup, all of them work individually (even the generated one) and together if I remove the generated directory:

17:26:14 ~$ for i in $(seq 1 100); do dust -rbd0 repro/build ; done &>repro.log
17:27:31 ~$ grep -c 'stack overflow' repro.log
0
17:27:38 ~$ for i in $(seq 1 100); do dust -rbd0 repro/src ; done &>repro.log
17:28:07 ~$ grep -c 'stack overflow' repro.log
0
17:28:11 ~$ for i in $(seq 1 100); do dust -rbd0 repro/qwe ; done &>repro.log
17:28:39 ~$ grep -c 'stack overflow' repro.log
0
17:28:54 ~$ for i in $(seq 1 100); do dust -rbd0 repro/toolchain ; done &>repro.log
17:29:02 ~$ grep -c 'stack overflow' repro.log
0
17:29:04 ~$ mv repro/qwe ~/
17:29:23 ~$ ls repro/
build  src  toolchain
17:29:26 ~$ for i in $(seq 1 100); do dust -rbd0 repro/ ; done &>repro.log
17:29:54 ~$ grep -c 'stack overflow' repro.log
0

Does du work ?

It does, every time:

17:32:29 ~$ ls repro
build  qwe  src  toolchain
17:32:45 ~$ for i in $(seq 1 100); do /bin/du -sh repro/ ; done &>repro.log
17:32:59 ~$ cat repro.log
9.4G    repro/
9.4G    repro/
9.4G    repro/
9.4G    repro/
...

Do you have any suspicious directories that symlink to interesting places ?

Nothing suspicious outside of the one I described.

Approx. how big is '~'?

17:35:01 ~$ /bin/du -sh repro/*
1012M   repro/build
0       repro/qwe
3.5G    repro/src
181M    repro/toolchain
17:35:07 ~$ find repro | wc -l
173841
17:35:30 ~$ find repro/build | wc -l
29547
17:35:36 ~$ find repro/qwe | wc -l
500
17:35:40 ~$ find repro/src | wc -l
120892
17:35:43 ~$ find repro/toolchain | wc -l
4546
17:35:49 ~$ find repro -type f | wc -l
150791

BobIsOnFire avatar Aug 09 '22 14:08 BobIsOnFire

Thanks @BobIsOnFire, for a very detailed and interesting bug report.

Sadly I can't reproduce the failure on Ubuntu. However, the stack overflow itself is the clue, so I'll experiment with increasing and decreasing the stack size.

Interesting that it worked with dust 0.5.4.

bootandy avatar Aug 12 '22 12:08 bootandy

I've been able to reproduce it now - it's something to do with how Rayon handles its number of threads and/or stack size. I'm investigating.

bootandy avatar Aug 13 '22 09:08 bootandy

FYI, I had the same issue in 0.8.1 (the Arch package version) and got the following output:

thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow
Aborted (core dumped)

However, this no longer occurs in the latest version of the master branch. Thank you for the fix.

kyoheiu avatar Aug 17 '22 21:08 kyoheiu

Same issue: 0.8.6, macOS on M1, folder is 73G.

thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow

kritma avatar Nov 12 '23 02:11 kritma

So this can happen if the stack size is too low.

Conversely, if the stack size is too big, dust won't run on low-powered machines (e.g. small Linode servers).

I'm surprised it is happening on a mac with a 73G folder.

If you are comfortable hacking dust yourself, the stack allocation is here: https://github.com/bootandy/dust/blob/master/src/main.rs#L229 . If you increase it, I'd be curious to know whether that fixes your issue.

Perhaps I should let the user specify stack size in a later version.
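
To illustrate the trade-off described above, here is a minimal sketch and not dust's actual code. It uses only the standard library: `std::thread::Builder::stack_size` stands in for whatever dust configures on its Rayon workers (Rayon's `ThreadPoolBuilder` has a `stack_size` setting of its own). The names `depth_sum` and `run_with_stack`, the 64 MiB figure and the depth of 20,000 are all made up for the example.

```rust
use std::thread;

/// Hypothetical stand-in for a recursive walk: recurses `depth` levels and
/// keeps a small array alive in every frame so each call uses real stack space.
fn depth_sum(depth: u64) -> u64 {
    let frame = std::hint::black_box([depth; 16]);
    if depth == 0 {
        0
    } else {
        // `frame[0] - depth` is always 0; it only keeps `frame` in use.
        1 + depth_sum(depth - 1) + (frame[0] - depth)
    }
}

/// Run the recursion on a thread whose stack size is chosen explicitly.
/// Note: a stack overflow is not a panic. The runtime aborts the whole
/// process, so `join` cannot recover from a stack that is too small.
fn run_with_stack(stack_bytes: usize, depth: u64) -> u64 {
    thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(move || depth_sum(depth))
        .expect("failed to spawn thread")
        .join()
        .expect("worker thread panicked")
}

fn main() {
    // 64 MiB is an arbitrary illustrative value, not a dust default.
    let levels = run_with_stack(64 * 1024 * 1024, 20_000);
    println!("recursed {levels} levels without overflowing");
}
```

Passing a much smaller `stack_bytes` with the same depth is the quickest way to see the "has overflowed its stack" abort locally, while a very large value may fail at spawn time on machines with little memory.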

bootandy avatar Nov 15 '23 23:11 bootandy

Can you just get rid of recursion? There may be a significant speed increase.

kritma avatar Nov 16 '23 17:11 kritma

Did you try a single thread with async fs?

kritma avatar Nov 16 '23 17:11 kritma

Did you try a single thread with async fs?

I have not tried that, no. That would probably be a significant rewrite.

bootandy avatar Nov 18 '23 11:11 bootandy

Can you just get rid of recursion? There may be a significant speed increase.

We've been through several iterations; this is as fast as we have been able to make it. Introducing Rayon gave large speed increases.

By nature, file walking tends to be recursive, especially if you are running with lots of threads. I did have a version that assigned a thread to each subdirectory and ran without recursion, but I couldn't allocate the threads well and the performance was poor.
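
For comparison, a non-recursive walk can be sketched with an explicit work list. This is a single-threaded illustration using only the standard library, not the thread-per-subdirectory version mentioned above and not dust's implementation. The names `total_size` and `make_nested` are hypothetical. Directory depth here costs heap space in a `Vec` rather than call-stack frames, which says nothing about how it would compare with the Rayon version for speed.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Sum the sizes of all regular files under `root` (assumed to be a
/// directory) using an explicit work list instead of recursive calls.
fn total_size(root: &Path) -> io::Result<u64> {
    let mut pending: Vec<PathBuf> = vec![root.to_path_buf()];
    let mut total = 0u64;
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            // `file_type` does not follow symlinks, so links are skipped.
            let file_type = entry.file_type()?;
            if file_type.is_dir() {
                pending.push(entry.path());
            } else if file_type.is_file() {
                total += entry.metadata()?.len();
            }
        }
    }
    Ok(total)
}

/// Create `depth` nested "qwe" directories under `root`, similar in spirit
/// to the reproduction earlier in the thread; returns the deepest one.
fn make_nested(root: &Path, depth: usize) -> io::Result<PathBuf> {
    let mut deepest = root.to_path_buf();
    for _ in 0..depth {
        deepest.push("qwe");
    }
    fs::create_dir_all(&deepest)?;
    Ok(deepest)
}

fn main() -> io::Result<()> {
    let root = std::env::temp_dir().join(format!("walk_demo_{}", std::process::id()));
    let deepest = make_nested(&root, 100)?;
    fs::write(deepest.join("data.bin"), [0u8; 1234])?;
    println!("{} bytes under {}", total_size(&root)?, root.display());
    fs::remove_dir_all(&root)?;
    Ok(())
}
```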

bootandy avatar Nov 18 '23 11:11 bootandy

I'm afraid I have this issue. I'm on an M1 MacBook Pro with 16GB and I have the latest version of Dust installed :-)

matthewblott avatar Jan 02 '24 21:01 matthewblott

I'm bringing back the '-stack-size' parameter and providing better defaults with the next version.

bootandy avatar Jan 03 '24 21:01 bootandy

@matthewblott - I have pushed a new version of dust, 0.9.0. Does it still have this issue? If yes, can you try running dust with different stack sizes like this:

dust -S 1048576
dust -S 1073741824
// Increase the stack size if the error is still seen - thanks

bootandy avatar Jan 09 '24 23:01 bootandy

@bootandy That works a treat, thanks :-)

matthewblott avatar Jan 30 '24 11:01 matthewblott