morty icon indicating copy to clipboard operation
morty copied to clipboard

Morty stack overflow

Open flaviens opened this issue 3 years ago • 14 comments

Hi there! Me again :innocent:

Morty stack overflows on some CPU code.

Output

thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow
[1]    30715 IOT instruction (core dumped)  /home/user/.cargo/bin/morty -f Bender.sources

How to reproduce

Tried on Ubuntu 22.04 with morty 0.8.0.

git clone https://github.com/flaviens/a2o.git
cd a2o
git checkout reprod_morty_stkovf
morty -f Bender.sources -DNCLK_WIDTH=1

Thanks!

flaviens avatar Oct 26 '22 09:10 flaviens

I have encountered stack overflows with morty before, but have not had the chance to concretely debug it or look for the root cause. I believe it is due to sv-parser, and it occurs for very long systemverilog statements. Modifying the code to split these statements works, but is a cumbersome workaround. Feel free to dig into the morty and sv-parser code and find the root cause, I'd be happy to review a corresponding pull request!

micprog avatar Oct 26 '22 09:10 micprog

Hi Michael, thank you for your quick response!

Having no familiarity with the implementation of Morty and sv-parser, it's unlikely that I can find the root cause if you cannot yourself unfortunately. Therefore I'm fine with temporarily pre-processing some long systemverilog statements. Do you have any additional information about how to proceed in this direction (which type of statements, which length, typical problems)?

Thanks!

flaviens avatar Oct 26 '22 09:10 flaviens

I think the easiest way to find the culprit is to use morty's verbose mode to help figure out which file contains the issue. From my experience it is usually very long assign statements, but I haven't dug deep into the specifics.

micprog avatar Oct 26 '22 10:10 micprog

Thank you for your response! I made several experiments:

  1. Morty on each file individually: this way, I do not get any stack overflow. This hampers the direct use of -verbose for debugging unfortunately, since the bug comes from combining the files.
  2. Morty on each file individually and then Morty on the output of each file. I get the stack overflow.
  3. sv2v on each file individually and then Morty on the output of each file. I get the stack overflow, despite the source files looking very different compared to experiments 1 and 2. Maybe the problematic sequence of lines (experiment 1 showed that it is not a single line) has been preserved. Experiment 3 could suggest that the bug does not originate from the parser, without strong evidence though.

flaviens avatar Oct 26 '22 12:10 flaviens

I don't know how much I can help here. I tested your repository and, after removing mmq_spr.v due to errors (I believe begin with no end), I was able to run everything with the sequential implementation (on the branch that can sequentially propagate defines), albeit with the -i flag to bypass several failing files. To propagate defines, everything needs to be run sequentially, but the parallel implementation on the master branch seems to fail, I assume because simultaneous processing puts excessive load on the stack. If you run the sequential version with -v you should see the last file that has started parsing, so if there is an issue you should be able to detect it that way.

micprog avatar Oct 26 '22 13:10 micprog

Hi again! Sorry that it was not clear, you could get rid of the parse errors by rebasing to the fix_begin_genvar branch. I did it for you, you just need to force pull. I tried with -v, and it still stack overflows. The problem is that it displays a different last file before bug for each run. I also noticed that the order in which morty treats files is always different.

flaviens avatar Oct 26 '22 13:10 flaviens

Are you using the development version on the collect_defines branch? The master branch also shows the behavior you describe for me, AFAIK due to the parallel parsing of the files.

micprog avatar Oct 26 '22 13:10 micprog

@huettern also had the same problem with some large auto-generated Xilinx files. My assumption would be that this is due to an exceeded parse depth.

zarubaf avatar Oct 26 '22 14:10 zarubaf

Thank you for your answers! Indeed with the collect_defines files it seems to work :partying_face: ! However it's super slow, so not the best candidate for a master. How about a flag for parallelism?

flaviens avatar Oct 26 '22 14:10 flaviens

Great that it works. Indeed it is far slower, but depending on the desired feature set it is the way to go. I doubt there is much desire to spend time speeding up parsing significantly for morty, AFAIK pickling is an infrequent task that can tolerate some latency. I plan to add some flags for configuration to enable/disable define chaining, parallelism, etc. (one of the reasons the branch is not yet ready for merging), but haven't found the time.

micprog avatar Oct 26 '22 15:10 micprog

That would be great! Thanks!

flaviens avatar Oct 26 '22 15:10 flaviens

Hi @flaviens, Please check out #52 and let me know if it works for you.

micprog avatar Oct 27 '22 11:10 micprog

Nice, the -q flag seems to do the job! Why is --propagate_defines incompatible with --top, is it a conceptual mismatch or an implementation difficulty?

flaviens avatar Oct 27 '22 14:10 flaviens

Glad to hear that. --top removes files that are not needed for the specified top module, while --propagate_defines directly chains the parsed files due to included defines. Morty currently does not directly investigate all internal constructs, only modules, packages, and interfaces, so global typedefs (as you were using in #49) or other top-level elements may not be handled correctly. Arguably the incompatibility could result in a warning instead of ignoring the top module.

micprog avatar Oct 27 '22 14:10 micprog