cocotb icon indicating copy to clipboard operation
cocotb copied to clipboard

Cocotb only supports Verilator 4.106

Open themperek opened this issue 5 years ago • 16 comments

cocotb: master 9b442d07870751ff656eb1ed53d24a3d806e9610 os: centos 7 64bit simulator: Verilator 4.107 devel Python: 3.7.6 (conda)

log:

     -.--ns INFO     cocotb.gpi                         ..mbed/gpi_embed.cpp:74   in set_program_name_in_venv        Did not detect Python virtual environment. Using system-wide Python interpreter
     -.--ns INFO     cocotb.gpi                         ../gpi/GpiCommon.cpp:105  in gpi_print_registered_impl       VPI registered
     -.--ns INFO     cocotb.gpi                         ..mbed/gpi_embed.cpp:244  in embed_sim_init                  Python interpreter initialized and cocotb loaded!
     0.00ns INFO     cocotb                                      __init__.py:212  in _initialise_testbench           Running on Verilator version 4.107 devel
     0.00ns INFO     cocotb                                      __init__.py:219  in _initialise_testbench           Running tests with cocotb v1.5.0.dev0 from /themperek/git/cocotb/cocotb
     0.00ns INFO     cocotb                                      __init__.py:239  in _initialise_testbench           Seeding Python random module with 1608640613
     0.00ns INFO     cocotb.regression                         regression.py:127  in __init__                        Found test test_dff.test_dff_simple
     0.00ns INFO     cocotb.regression                         regression.py:466  in _start_test                     Running test 1/1: test_dff_simple
     0.00ns INFO     ..st.test_dff_simple.0x7f08b057d650       decorators.py:313  in _advance                        Starting test: "test_dff_simple"
                                                                                                                     Description:  Test that d propagates to q 

simple_dff_verilator.log

themperek avatar Dec 22 '20 12:12 themperek

It seems that Verilator 4.106 works.

themperek avatar Dec 22 '20 17:12 themperek

Looking at it, there were recent changes to the Verilator VPI code relating to iterating over the Module hierarchy. I can have a look at this soon, but setting VERILATOR_SIM_DEBUG=1 and COCOTB_LOG_LEVEL=DEBUG should help figure out what's going on.

marlonjames avatar Dec 22 '20 18:12 marlonjames

@garmin-mjames I have added log file. It is cut it goes like this forever.

themperek avatar Dec 23 '20 09:12 themperek

Tracked this down to a GPI bug where we leak callback handles and never call vpi_remove_cb() on the callback for NextTimeStep, ReadWrite, ReadOnly triggers.

Verilator gets stuck in an infinite loop because we don't remove the ReadWrite callback.

marlonjames avatar Dec 24 '20 06:12 marlonjames

After thinking about this more, and reviewing the VPI sections of the standard, I now think this should be categorized as a Verilator bug, not a GPI bug. Below is my understanding and rationale:

37.2 VPI Handles A handle is an opaque reference to an object in the VPI information model. [...]

37.2.4 Validity of handles The lifetime of an object is the duration of existence of the object in the VPI information model. [...] A tool can create a handle that refers to an object only during the lifetime of the object. A handle is said to be valid from the time of its creation until the time at which it is released, or until the object that it refers to ceases to exist, or until termination of the tool; at other times it is invalid. A VPI program shall not refer to an object using an invalid handle, or shall a VPI program attempt to release an invalid handle.

37.3.7 Lifetime of objects [...] Objects that may have a lifetime shorter than the duration of the simulation are called transient objects. [...]

Other transient objects include the following: e) Callbacks (see 38.36)

37.3.8 Managing transient objects One may obtain a handle to an object during its lifetime, and it remains valid only as long as the object exists. [...] For a transient object, one may release its handle after use or expect that handle to be released and become invalid when the object ceases to exist.

Because callbacks are transient objects, provided the lifetime of the callback is understood, releasing handle(s) to the callbacks is not necessary.

Further, some callbacks are one-time callbacks, and do not need to be removed using vpi_remove_cb(). I think for some simulators, the callback may be removed from whatever internal data structure is used to track registered callbacks before it is called, so vpi_release_handle()/vpi_remove_cb() may be causing an error (This may be the problem with Modelsim that was interpreted as "double freeing" the handle?) Although it should probably just be a PLI error and not a segfault, the standard does say "A VPI program shall not refer to an object using an invalid handle, nor shall a VPI program attempt to release an invalid handle"

For Simulation time callbacks (38.36.2) such as cbReadWriteSynch and cbReadOnlySynch, Verilator does not look at the cb_data_p->time field at all, and registers the callbacks as a repeated callback, which seems to be different than other simulators. I just confirmed in Active-HDL that registering a cbReadWriteSynch with time 0 will call the callback only once.

marlonjames avatar Jan 19 '21 22:01 marlonjames

and registers the callbacks as a repeated callback, which seems to be different than other simulators.

The behavior of only servicing callbacks once per registry makes sense to me. If Verilator isn't behaving the same, that sounds like a bug to me.

This may be the problem with Modelsim that was interpreted as "double freeing" the handle?

Ugh... I hope not. We have to ask if that mean that Modelsim makes different assumptions, or that other simulators are working by accident.

Nice work tracking this down.

ktbarrett avatar Jan 19 '21 23:01 ktbarrett

Since this is potentially a bug in Verilator, should this be removed from 1.5?

ktbarrett avatar Jan 25 '21 17:01 ktbarrett

It can be, but that means ONLY Verilator 4.106 is supported, since 4.108 has the change that breaks cocotb.

marlonjames avatar Jan 25 '21 17:01 marlonjames

It's not like Verilator is frozen in time. Eventually it will be fixed and new versions will work. If you want we could add a rule to the makefiles that says 4.108 will not work, but who knows now if 4.110 will? So we either block cocotb's release until one simulator is fixed (without knowing how long that will take), or we release it as is knowing eventually Verilator will be fixed.

ktbarrett avatar Jan 25 '21 17:01 ktbarrett

The 1.3 release notes had a caveat on Verilator support (https://docs.cocotb.org/en/stable/release_notes.html#new-features). We may want to add something like that to the Simulator Support page (or do #2097).

marlonjames avatar Jan 25 '21 17:01 marlonjames

I agree that outright stating that Verilator support is still experimental is probably a good idea. But I'm trying to whittle down work before we can release 1.5, not do more 😄. I'll add a note.

ktbarrett avatar Jan 25 '21 18:01 ktbarrett

Upstream bug: https://github.com/verilator/verilator/issues/2778

imphil avatar May 08 '21 22:05 imphil

Yes, it hangs for me too when i use verilator

varain007 avatar Sep 09 '21 09:09 varain007

@varain007 Only verilator 4.106 is supported right now.

ktbarrett avatar Sep 09 '21 13:09 ktbarrett

To add fuel to this, Verilator 4.106 has a bug which results in:

/usr/share/verilator/include/verilated.cpp: In function ‘IData VL_FGETS_NI(std::string&, IData)’:
/usr/share/verilator/include/verilated.cpp:1303:36: error: ‘numeric_limits’ is not a member of ‘std’
 1303 |     return getLine(dest, fpi, std::numeric_limits<size_t>::max());
      |                                    ^~~~~~~~~~~~~~
/usr/share/verilator/include/verilated.cpp:1303:57: error: expected primary-expression before ‘>’ token
 1303 |     return getLine(dest, fpi, std::numeric_limits<size_t>::max());
      |                                                         ^
/usr/share/verilator/include/verilated.cpp:1303:60: error: ‘::max’ has not been declared; did you mean ‘std::max’?
 1303 |     return getLine(dest, fpi, std::numeric_limits<size_t>::max());
      |                                                            ^~~
      |                                                            std::max
In file included from /usr/include/c++/11.1.0/algorithm:62,
                 from /usr/share/verilator/include/verilated_heavy.h:29,
                 from /usr/share/verilator/include/verilated_imp.h:29,
                 from /usr/share/verilator/include/verilated.cpp:25:
/usr/include/c++/11.1.0/bits/stl_algo.h:3467:5: note: ‘std::max’ declared here
 3467 |     max(initializer_list<_Tp> __l, _Compare __comp)
      |     ^~~

for even trivial builds (e.g. a MUX tester) - patched in 4.110. So it would be great if this could get solved on the Verilator side...

benjaminmordaunt avatar Sep 24 '21 12:09 benjaminmordaunt

@benjaminmordaunt I also encountered this issue with GCC 11.2. A quick fix is to edit $(PREFIX)/share/verilator/include/verilated.cpp and add a line to include the header file that defines std::numeric_limits (#include <limits>)

vighneshiyer avatar Mar 31 '22 16:03 vighneshiyer

Thanks @vighneshiyer for providing this fix. With that it works for now.

Nevertheless: Is there any chance of this moving forward? Verilator&CocoTB in combination are just awesome and especially for exhaustive module testing there is hardly any better combo right now.

I'm more than happy to try out new stuff with my testbenches but it seems that this has ground to a halt somehow.

markusdd avatar Nov 13 '22 21:11 markusdd

Verilator 5.004 has a fix for the callback issue this issue was about (https://github.com/verilator/verilator/commit/c2bdd06fccd4ef551948ed6dfe0eff76004fa117), hence I'm closing this issue.

There are still a number of test failures in our test suite, and Verilator has moved on since 4.106 times and there are some things we can benefit from in cocotb as well (e.g., timing support!). I filed https://github.com/cocotb/cocotb/issues/3194 to keep track of the remaining work.

imphil avatar Jan 22 '23 22:01 imphil

Based on this list list it should be fixed on Ubuntu starting with version 23.04 .

bat52 avatar Jun 15 '23 17:06 bat52