Segmentation Fault When Lowering device.rerouting.period and device.rerouting.adaptation-interval

Open henrigrossmann opened this issue 1 year ago • 1 comments

Description: I am encountering a reproducible segmentation fault when running simulations in SUMO Version v1_20_0+1629-53d0e27b5f3. The issue arises when setting low values for the <device.rerouting.period value="..."/> and <device.rerouting.adaptation-interval value="..."/> parameters (in this example 45s) – even on a single thread.

Steps to Reproduce: Run the provided simulation example. The simulation runs until ~06:45 am and then crashes consistently with a segmentation fault.

Observations:

  • The issue does not occur (or occurs only at a late time step) when setting <device.rerouting.adaptation-steps value="0" />.
  • Increasing the values for <device.rerouting.period/> and <device.rerouting.adaptation-interval/> makes the segmentation fault less likely to occur.
  • The issue is more likely to occur when running the simulation on multiple threads.

Expected Behavior: The simulation should run to completion without a segmentation fault, regardless of the values set for <device.rerouting.period/> and <device.rerouting.adaptation-interval/>.

Additional Information:

  • SUMO Version: v1_20_0+1629-53d0e27b5f3
  • Run the simulation by simply using the command: sumo -c ~/error_example/input/sumo.sumocfg

Attachments: error_example.zip

Thank you for looking into this issue.

henrigrossmann avatar Aug 28 '24 10:08 henrigrossmann

I couldn't replicate the crash yet:

  • v1_20_0-1791-g53f26980760, linux running until 17:00
  • v1_20_0+1629-53d0e27b5f3, linux running until 8:45 (debug build)
  • v1_20_0+1629-53d0e27b5f3, linux running until 8:00 (release build)
  • v1_20_0-1791-g53f26980760, windows running until 8:00 (release build)

What platform did you experience the crash on? Could you replicate the crash on different machines? Does it always happen in the same simulation step? Could you try running with the debug build and sending a stack trace?

namdre avatar Aug 30 '24 06:08 namdre

I have the same issue.

  • my platform is MacOS Sonoma 14.5
  • it always happens at the same simulation step
  • I can try to run with debug build and stack trace, if I remember how to do it properly :)

ambuehll avatar Aug 30 '24 06:08 ambuehll

I finally caught the crash at 18:00 and trapped it in a debugger.

namdre avatar Aug 30 '24 08:08 namdre

  • I'm running it on MacOS Sonoma 14.6.1
  • It always happens at the same simulation step, at time 24344 (06:45:44), so earlier than the crash that you observe. @ambuehll also observes the crash at 06:45:44.

henrigrossmann avatar Aug 30 '24 08:08 henrigrossmann

problem introduced via b2b8eec

namdre avatar Aug 30 '24 08:08 namdre

Hi Jakob,

Thank you for the quick fix! Unfortunately, I still get segmentation faults (most likely for a different reason, because the example I provided now runs without any issues). Here's an example that crashes reproducibly at second 59667 when running with seed 50. With seed 51 it works.

SUMO Version: v1_20_0+1821-fef78581e33

Thank you for looking into this.

example_fault.zip

henrigrossmann avatar Sep 02 '24 07:09 henrigrossmann

I'll look into it. Can you get it to crash without --routing-threads?

namdre avatar Sep 02 '24 10:09 namdre

When running your scenario and config with the debug version of SUMO it completed without crashing for me.

namdre avatar Sep 02 '24 13:09 namdre

Ah yes, it only happens when using multiple (5) threads – the segmentation fault comes from there. Thanks for the hint.

henrigrossmann avatar Sep 02 '24 13:09 henrigrossmann

I also ran with 5 threads and couldn't see it crash. In what percentage of multi-thread runs does the crash show up for you? (when keeping the seed fixed).

My guess would still be that the crash is related to parallel routing but it doesn't have to be.

namdre avatar Sep 02 '24 13:09 namdre

I ran it 4 times and it always crashed at the same time (59667). So 100 % crash rate when running on 5 threads.

I tried to run it on 3 threads (same seed) -> works fine.
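
For anyone wanting to repeat this kind of check, here is a minimal sketch of how the crash rate per thread count could be measured. It assumes a POSIX system (where a process killed by a signal reports a negative return code), that `sumo` is on the PATH, and that the run uses the `--seed` and `--routing-threads` options. The helper names `crash_rate` and `sumo_command` and the config path in the usage comment are hypothetical, not part of SUMO.

```python
import signal
import subprocess


def crash_rate(cmd, runs=4):
    """Run cmd `runs` times and return the fraction of runs killed by SIGSEGV."""
    crashes = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        # On POSIX, a return code of -N means the process was killed by signal N.
        if result.returncode == -signal.SIGSEGV:
            crashes += 1
    return crashes / runs


def sumo_command(config, seed, threads):
    """Build the sumo command line for a fixed seed and routing thread count."""
    return [
        "sumo", "-c", config,
        "--seed", str(seed),
        "--routing-threads", str(threads),
    ]


# Usage (hypothetical config path):
# for threads in (1, 3, 5):
#     rate = crash_rate(sumo_command("example_fault/sumo.sumocfg", 50, threads))
#     print(f"{threads} routing threads: {rate:.0%} of runs segfaulted")
```

Keeping the seed fixed and varying only the thread count separates "crashes with this seed" from "crashes with this number of routing threads".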

henrigrossmann avatar Sep 02 '24 13:09 henrigrossmann

I ran it on linux and windows with 5 threads (same seed) and didn't see the crash (tested release and debug version). There must be some other aspect of our setup that causes the behavioral differences (which we already noted for the other crash scenario).

namdre avatar Sep 02 '24 16:09 namdre

The error could not be reproduced on a different MacBook, so it seems likely that the issue is related to my setup, although I haven't been able to identify the cause -> You can close the issue. Thank you again for your help.

henrigrossmann avatar Sep 09 '24 15:09 henrigrossmann

If you figure out something else, please don't hesitate to post more!

namdre avatar Sep 09 '24 20:09 namdre