cocotb
cocotb copied to clipboard
Proposal: revert handle.py cast removals
This commit https://github.com/cocotb/cocotb/commit/a59fba89fe177d9d0d2532238a55f85a0a18e4f1 breaks a lot of code
I think str(dut.myhandle) should be removed, because the cast does something that's sort of ambiguous- it seems like it might interpret the value as a string, but it actually returns the path.
int(dut.myhandle) has always returned the value as an int and turning it into int(dut.myhandle.value) doesn't make it any more readable. So I think it makes sense to add int() and float() back in. I agree in general builtin overrides aren't a great solution because of the ambiguity, but I think the cost of the amount of code these break justifies leaving them in.
We are ramping up for a 2.0 release, there is no guarantee of backwards compatibility from here on out. The interface to logic values is going to change considerably, as is how tasks are handled. No matter how you shake it, if you want to upgrade 1.X code to 2.X you are going to have to do considerable code changes.
I understand there's going to be considerable code changes, it just doesn't seem worth it to take this out
int
casts on logic vectors are ambiguous. Did you want unsigned or two's compliment? That makes float casts on them ambiguous. Retaining int and float casts on handles where it isn't unambiguous (IntegerObject or RealObject) would be fine, but I doubt many people are using them, it's ugly syntax, and inconsistent since we can't support them on all value types. Add to that that people are already going to have to go over their code, keeping it doesn't seem worthwhile.
Is the plan to remove __index__()
from BinaryValue
as well? Because:
int(dut.foo)
is the same as
int(dut.foo.value)
but in the case of:
logic signed [63:0] foo;
initial foo = -64'd123;
one gets:
(Pdb) int(dut.foo.value)
18446744073709551493
whereas I assume the intended path is to know the representation of foo
and do:
(Pdb) dut.foo.value.signed_integer
-123
__index__()
or int()
of a BinaryValue
is ambiguous for the same reasons stated above. But at the end of the day one just needs to "know" what the representation of a handle is. BinaryValue.integer
doesn't change that either. Ultimately int()
on a handle is just calling BinaryValue.integer
:
def __index__(self):
return self.integer
Perhaps the GPI can help here by discovering the signedness of a handle to help BinaryValue
do the right thing? But in the short term can you give a specific example of how removing __index__()
from handles makes things less ambiguous? Apologies if I'm missing something here but it seems like this is just removing convenience methods to ambiguity that continues to exist.
Is the plan to remove
__index__()
fromBinaryValue
as well?
Yes.
BinaryValue.integer
doesn't change that either.
Technically it does, it's just poorly named. There is a BinaryValue.signed_integer
, so there is a distinction between unsigned and signed representations. But another issue with this interface is the effect of endianness. Perhaps that should or shouldn't be a part of the new interface? Is the logic array "0001100011000011"
a little endian 2 8-bit byte words, big endian 2 8-bit byte words or a left-MSB bit vector? (AFAIK no one uses right-MSB bit vectors, so we can ignore that).
Perhaps we should build into the class the assumption that the logic array is a left-MSB bit array and if people want different representations, they are going to have to use a different interface? So we can keep the interface small and the implementation simple?
Yeah, endianness is unfortunately not expressable in the RTL. So even if we could discover signedness, I don't believe it's fully possible to "just know" the right way to express a signal's value as an integer. And since handle.py
constructs BinaryValue
s on the fly you're stuck either having to somehow change binaryRepresentation
after you've fetched the handle's BinaryValue
(don't believe this is possible today) or you have to provide unsigned_big_endian_integer
, signed_big_endian_integer
, unsigned_little_endian_integer
, signed_little_endian_integer
.
It seems like it would be better if the handle could know or be told what the binary representation is. That way the user could specify this at driver / monitor construction time (or whatever) and be done with it. Also, if we're going down that path I'd suggest that we provide a way to make a DUT-wide default setting. And then if you had that you could go back to exposing __index__()
on handles which would do the right thing unless it had not been told the binary representation and hadn't been given a default, in which case it would raise an error.
Thoughts?
So we did decide to reintroduce some of the non-ambigous casts like IntegerObject.__int__
and RealObject.__float__
, and LogicObject.__str__
, but we will leave SimHandleBase.__str__
, LogicObject.__int__
, LogicObject.__float__
removed as the first is not a good idea really and the last two are ambigous. But the casts will be reintroduced as deprecated. We slightly changed our plan for 2.0 instead of being the big break, to be the judicious break and deprecate the rest.
I'll get to @toddstrader's most recent comment in a bit.
What you mentioned would work and allow int
casts on BinaryValue
and LogicObject
to continue to work, but adds state to every handle and all the values coming out of it, making things complicated. While we didn't think of that to discuss it, I don't know if that's going to get much love with the other maintainers. We are definitely going forward with the approach of stateless LogicArray
type and the user being explicit at the point of use what the representation should be.
We did decide today that we do want some amount of backwards compatibility. IMO that means we leave .integer
, .signed_integer
, and .binstr
on the LogicArray interface as they are in fairly common use. That would also allow us to leave LogicObject.__int__
and LogicObject.__float__
as is. Of course all of that will be deprecated and removed in 3.0 which would be at least a year out.
No idea what the new interface for constructing ints looks like yet... I'll make sure I'll ping you when we go about adding them. But is there anything else you might think would be useful outside of two's complement vs unsigned and big vs little endian and maybe selecting how many bits are in a byte?
But is there anything else you might think would be useful outside of two's complement vs unsigned and big vs little endian and maybe selecting how many bits are in a byte?
I think that covers all the reasonable bases. Of course, the list is endless. You could have parity or ECC embedded with data, various fixed or floating point representations or BCD values, for example. But I don't think it's cocotb's job to solve all these problems.
Also, somewhat related, I've been pondering simulator.get_signal_value_vector()
which would use vpiVector
for a more compact (and presumably speedy) encoding of the data (obviously without strength data). That way the data could be held in Python in aval
and bval
bytes
objects. This seems fundamentally incompatible with LogicArray
, but I'm wondering if there's a base class to be had here, at which point the user could tell cocotb which logic / data representation class to use for handle data. This could even potentially allow users to bring their own 9-bit byte, pig latin, BCD logic classes, which would further absolve cocotb from worrying about every possible encoding.
Also, somewhat related, I've been pondering
simulator.get_signal_value_vector()
which would usevpiVector
for a more compact (and presumably speedy) encoding of the data (obviously without strength data). That way the data could be held in Python inaval
andbval
bytes
objects.
I'm not entirely sure what you mean here. What does vpiVector
return?
Sorry, I should have said vpiVectorVal
. E.g.:
logic [63:0] lgc_64;
initial lgc_64 = 64'h12345678abcdefxx;
plus
vpiHandle lgc64Handle = vpi_handle_by_name("dut.lgc_64", NULL);
s_vpi_value lgc64Value;
lgc64Value.format = vpiVectorVal;
vpi_get_value(lgc64Handle, &lgc64Value);
for (i = 0; i < 2; i ++) {
printf("LGC_64 [%d] a = %08x b = %08x\n", i, lgc64Value.value.vector[i].aval, lgc64Value.value.vector[i].bval);
}
yields
LGC_64 [0] a = abcdefff b = 000000ff
LGC_64 [1] a = 12345678 b = 00000000