Define a new data type like UInt[N]

Open iSunfield opened this issue 2 years ago • 9 comments

I'm trying to define a data type with a variable bit length, similar to UInt[N], and perform custom operations on it. Modeled on the UInt[N] code, I created the Newdt class below. However, I get LLVM errors because LLVM does not recognize the new data type. I'm fine with Newdt having the same data structure as UInt[N]; I just need LLVM to recognize the type and allow type casting. Is there a way to cast to a new data type in Codon?

# The _divmod_10 function is copied from the Int[N] class

def _divmod_10(dividend, N: Static[int]):
    T = type(dividend)
    zero, one = T(0), T(1)
    neg = dividend < zero
    dvd = dividend.__abs__()

    remainder = 0
    quotient = zero

    # Euclidean division
    for bit_idx in range(N - 1, -1, -1):
        mask = int((dvd & (one << T(bit_idx))) != zero)
        remainder = (remainder << 1) + mask
        if remainder >= 10:
            quotient = (quotient << one) + one
            remainder -= 10
        else:
            quotient = quotient << one

    if neg:
        quotient = -quotient
        remainder = -remainder

    return quotient, remainder

@tuple
@__internal__ 
@__notuple__
class Newdt[N: Static[int]]:
    pass

@extend
class Newdt:
    def __new__() -> Newdt[N]:
        return Newdt[N](0)
    def __new__(what: int) -> Newdt[N]:
        print(sizeof(what))
        if N < 64:
            return Newdt[N](__internal__.int_trunc(what, 64, N))
        elif N == 64:
            return Newdt[N](Int[N](what))
        else:
            return Newdt[N](__internal__.int_sext(what, 64, N))

    @pure
    @llvm
    def __new__(what: Int[N]) -> Newdt[N]:
        ret i{=N} %what

    # The code below is copied from the Int[N] class
    def __repr__(self) -> str:
        return f"Int[{N}]({self.__str__()})"

    def __str__(self) -> str:
        if N <= 64:
            return str(int(self))

        if not self:
            return '0'

        s = _strbuf()
        d = self

        if d >= Int[N](0):
            while True:
                d, m = _divmod_10(d, N)
                b = byte(48 + m)  # 48 == ord('0')
                s.append(str(__ptr__(b), 1))
                if not d:
                    break
        else:
            while True:
                d, m = _divmod_10(d, N)
                b = byte(48 - m)  # 48 == ord('0')
                s.append(str(__ptr__(b), 1))

                if not d:
                    break
            s.append('-')

        s.reverse()
        return s.__str__()

    def __int__(self) -> int:
        if N > 64:
            return __internal__.int_trunc(self, N, 64)
        elif N == 64:
            return Int[64](self)
        else:
            return __internal__.int_sext(self, N, 64)

@extend
class Int:
    @pure
    @llvm
    def __new__(what: Newdt[N]) -> Int[N]:

        ret i{=N} %what
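
For readers unfamiliar with the technique, the _divmod_10 helper above is binary long division (shift and subtract) by the constant 10. A minimal plain-Python sketch of the same loop, using ordinary Python ints in place of Int[N], might look like the following. The name divmod_10 and the n_bits parameter are illustrative and not part of Codon's library.

```python
def divmod_10(dividend: int, n_bits: int) -> tuple:
    """Shift-and-subtract division by 10 over the low n_bits of |dividend|.

    Illustrative model only: assumes abs(dividend) fits in n_bits bits.
    """
    neg = dividend < 0
    dvd = abs(dividend)

    remainder = 0
    quotient = 0

    # Walk the bits from most significant to least significant.
    for bit_idx in range(n_bits - 1, -1, -1):
        bit = (dvd >> bit_idx) & 1
        remainder = (remainder << 1) + bit   # bring down the next bit
        quotient <<= 1
        if remainder >= 10:                  # the divisor fits: emit a 1 bit
            quotient += 1
            remainder -= 10

    # Negating both results gives division truncated toward zero.
    if neg:
        quotient, remainder = -quotient, -remainder

    return quotient, remainder
```

For example, divmod_10(1234, 16) gives (123, 4) and divmod_10(-57, 8) gives (-5, -7). Because the remainder of a negative value is itself negative, __str__ can compute each digit as 48 - m in the negative branch.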


Test code for new data type

a = Newdt[64](10)
print(a)


 error: LLVM: error: value doesn't match function result type '{}
          ret i64 %what
            ^
 (Newdt[64]:Newdt.__new__:2[Int[64]])

iSunfield • Jun 12 '23 09:06

@iSunfield You can define a new data type that wraps UInt[N] like this:

class Newdt[N: Static[int]]:
    number: UInt[N]

    def __init__(self, number: int):
        self.number = UInt[N](number)

    def __init__(self, number: Newdt[N]):
        self.number = number.number

    def __str__(self) -> str:
        return "Number: " + str(self.number)

    def new_method(self):
        return "New method..."

    def __add__(self, other: Newdt[N]) -> Newdt[N]:
        return Newdt[N](self.number + other.number)


a = Newdt[64](Int[64](100))
b = Newdt[64](Int[64](10))
print(a)
print(a.new_method())
print(isinstance(a, Newdt))
print(a + b)

print(Newdt[64](Newdt[64](1000)))

elisbyberi • Jun 12 '23 14:06

As @elisbyberi mentioned, I'd also suggest just wrapping a UInt[N]. You can also consider making Newdt a tuple type so that it's passed around by value, just like UInt[N] (that way, no allocation takes place when you instantiate it):

@tuple
class Newdt:
    number: UInt[N]
    N: Static[int]  # same as 'class Newdt[N: Static[int]]'

    def __new__(number: int):
        return Newdt(UInt[N](number))

    def __str__(self) -> str:
        return "Number: " + str(self.number)

    def new_method(self):
        return "New method..."

    def __add__(self, other: Newdt[N]) -> Newdt[N]:
        return Newdt[N](self.number + other.number)

arshajii • Jun 12 '23 14:06

I greatly appreciate your advice. As I understand it, the @tuple annotation means no memory allocation is made for each instance. With this, I can implement what I wanted in Codon. Below is code that compares the memory layout with and without the @tuple annotation on the class.

@tuple
class Newdt[N: Static[int]]:
    val: UInt[N]
    
    def __new__(v: int):
        return Newdt(UInt[N](v))

    def __setitem__(self,v: int):
        self.val = UInt[N](v)
    
    def __getitem__(self) -> UInt[N]:
        return self.val

    def __init__(self,v: int):
        pass

Dt = Array[Newdt[12]](10)
    
Dt[0] = Newdt[12](10)
Dt[1] = Newdt[12](20)

DtP = Dt.ptr
DtPBytePtr = DtP.as_byte()

for i in range(2):
    print('Allocation data of Dt['+str(i)+']:',end='')
    for j in range(8):
        print(int(DtPBytePtr[i*8+j]),end=',')
    print()

Result without @tuple

Allocation data of Dt[0]:240,159,216,181,66,127,0,0,
Allocation data of Dt[1]:224,159,216,181,66,127,0,0,

Result with @tuple

Allocation data of Dt[0]:10,0,20,0,0,0,0,0,
Allocation data of Dt[1]:0,0,0,0,0,0,0,0,

I was able to confirm that Newdt elements are laid out contiguously in the Array's memory. Thanks again for your support.
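
The with-@tuple dump above can be reproduced with a small plain-Python model. This is only a sketch under two assumptions that the output suggests but the thread does not state: each Newdt[12] element occupies 2 bytes in little-endian order, and the unassigned elements are zero. Under those assumptions, reading the buffer with a stride of 8 bytes per row (as the loop above does) shows several elements in one row, while a stride of 2 shows one element per row. The helper name dump is hypothetical.

```python
import struct

# Assumption: a 12-bit element is stored in 2 bytes, little-endian.
ELEM_SIZE = 2

# Dt[0] = 10, Dt[1] = 20; the remaining eight elements are assumed to be zero.
values = [10, 20] + [0] * 8
buf = struct.pack("<10H", *values)


def dump(data: bytes, stride: int, rows: int) -> list:
    """Return `rows` rows of `stride` bytes each, as lists of ints."""
    return [list(data[i * stride:(i + 1) * stride]) for i in range(rows)]


rows_stride8 = dump(buf, 8, 2)          # what the 8-byte loop above reads
rows_stride2 = dump(buf, ELEM_SIZE, 2)  # one element per row

print(rows_stride8)
print(rows_stride2)
```

With a stride of 8, the first row contains both 10 and 20, which matches the "Result with @tuple" lines. With a stride equal to the assumed element size, each row holds exactly one value.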

iSunfield • Jun 13 '23 02:06

@iSunfield For correctness, here's the fixed code to print bytes:

@tuple
class Newdt[N: Static[int]]:
    val: UInt[N]

    def __new__(v: int):
        return Newdt(UInt[N](v))

    def __init__(self, v: int):
        self.val = UInt[N](v)

    def __repr__(self):
        return str(self.val)


size: Static[int] = 4  # int size in bytes
length = 10  # array length

Dt = Array[Newdt[size * 8]](length)
for i in range(length):
    Dt[i] = Newdt[size * 8](i)

DtP = Dt.ptr
DtPBytePtr = DtP.as_byte()
for i in range(length):
    print('Allocation data of Dt[' + str(i) + ']: ', end='')

    for j in range(size):
        b = int(DtPBytePtr[i * size + j])
        print(b, end=', ')

    print()

prints:

Allocation data of Dt[0]: 0, 0, 0, 0, 
Allocation data of Dt[1]: 1, 0, 0, 0, 
Allocation data of Dt[2]: 2, 0, 0, 0, 
Allocation data of Dt[3]: 3, 0, 0, 0, 
Allocation data of Dt[4]: 4, 0, 0, 0, 
Allocation data of Dt[5]: 5, 0, 0, 0, 
Allocation data of Dt[6]: 6, 0, 0, 0, 
Allocation data of Dt[7]: 7, 0, 0, 0, 
Allocation data of Dt[8]: 8, 0, 0, 0, 
Allocation data of Dt[9]: 9, 0, 0, 0, 

elisbyberi • Jun 13 '23 20:06

I'm sorry for the delayed response, and thank you for correcting the mistake. When I try to add an addition operation to the new data type, mimicking UInt[N], I get the error below. In addition, when I implemented __add__ by returning the sum of two Newdt values, it called itself recursively and did not work. How should I define operations on the new data type?

@tuple
class Newdt[N: Static[int]]:
    val: UInt[N]

    def __new__(v: int):
        return Newdt(UInt[N](v))

    def __init__(self, v: int):
        self.val = UInt[N](v)

    def __repr__(self):
        return str(self.val)

    @pure
    @commutative
    @associative
    @llvm
    def __add__(self, other: Newdt[N]) -> Newdt[N]:
        %0 = add i{=N} %self, %other
        ret i{=N} %0

size: Static[int] = 4  # int size in bytes
length = 10  # array length

Dt = Array[Newdt[size * 8]](length)
for i in range(length):
    Dt[i] = Newdt[size * 8](i) + Newdt[size * 8](i)  # <-- changed: added the + operation

DtP = Dt.ptr
DtPBytePtr = DtP.as_byte()
for i in range(length):
    print('Allocation data of Dt[' + str(i) + ']: ', end='')

    for j in range(size):
        b = int(DtPBytePtr[i * size + j])
        print(b, end=', ')

    print()

error

error: LLVM: error: '%self' defined with type '{ i32 }' but expected 'i32'
        %0 = add i32 %self, %other
                     ^
 (Newdt[32]:Newdt.__add__:0[Newdt[32],Newdt[32]])

I then changed __add__ as below, and it called itself recursively:

    def __add__(self, other: Newdt[N]) -> Newdt[N]:
        print("Operation __add__")
        return self + other

Result

Operation __add__
Operation __add__
Operation __add__
Operation __add__
Operation __add__
Operation ^C

iSunfield • Jun 20 '23 16:06

@iSunfield The + operator is syntactic sugar for the __add__ method.

That means your __add__ method is calling itself: self + other is the same as self.__add__(other).

Remember, Newdt is just a wrapper around UInt. This is the implementation of __add__ shown previously:

def __add__(self, other: Newdt[N]) -> Newdt[N]:
    return Newdt[N](self.val + other.val)

I don't believe using inline LLVM IR here would result in any performance improvement.
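
The difference between the two __add__ bodies can be illustrated in plain Python, where + dispatches to __add__ in the same way. The sketch below is a model, not Codon code: the classes RecursiveAdd and WrappedUInt are hypothetical, and masking with (1 << bits) - 1 stands in for the wrap-around of UInt[N] arithmetic.

```python
class RecursiveAdd:
    """__add__ written in terms of +, so it calls itself."""

    def __add__(self, other):
        return self + other  # same as self.__add__(other): never terminates


class WrappedUInt:
    """Wrapper that delegates + to the wrapped integer field."""

    def __init__(self, val, bits=32):
        self.bits = bits
        self.val = val & ((1 << bits) - 1)  # keep only the low `bits` bits

    def __add__(self, other):
        # The + here is int addition on the fields, not WrappedUInt.__add__.
        return WrappedUInt(self.val + other.val, self.bits)

    def __repr__(self):
        return str(self.val)


def recursion_detected() -> bool:
    """Return True if adding two RecursiveAdd objects recurses without end."""
    try:
        RecursiveAdd() + RecursiveAdd()
    except RecursionError:
        return True
    return False
```

In Python the recursive version ends with a RecursionError, while the delegating version returns normally; for instance, WrappedUInt(255, 8) + WrappedUInt(1, 8) wraps around to 0.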

elisbyberi • Jun 20 '23 17:06

@elisbyberi, thank you very much for your kind support. I have confirmed that the performance of Newdt[N] is equivalent to that of UInt[N]. I am now trying to build a Python extension module for the Newdt class using the command below, but I get an error. Is it possible to build a Newdt library that can be imported from Python, the way NumPy is?

codon build -release --relocation-model=pic -pyext -o build/lib.linux-x86_64-3.10/vdt.cpython-310-x86_64-linux-gnu.so.o -module vdt src/vdt/vdt.py
error: cannot realize 'Newdt' for Python export

Code of vdt.py

@tuple
class Newdt[N: Static[int]]:
    val: UInt[N]

    def __new__(v: int):
        return Newdt(UInt[N](v))

    def __init__(self, v: int):
        self.val = UInt[N](v)

    def __repr__(self):
        return str(self.val)

    def __add__(self, other: Newdt[N]) -> Newdt[N]:
        print("Operation __add__",self,other)
        return Newdt[N](self.val + other.val)

iSunfield • Jun 21 '23 04:06

@iSunfield I would recommend against exporting these Codon-specific classes. To prevent their export, you can use the @tuple(python=False) decorator.

Instead, it would be better to define a function that utilizes this class:

def add_64(a, b) -> int:
    c = Newdt[64](a) + Newdt[64](b)

    return int(c.val)

elisbyberi • Jun 25 '23 21:06

Unfortunately, generic classes (such as Newdt[N]) cannot be exported to Python. You will need to instantiate them for that.

inumanag • Jul 26 '23 13:07