
getting started with classes

Open punkdit opened this issue 4 years ago • 7 comments

Am I doing this right?

from typed_python import Class, Final, Member, Entrypoint

class Number(Class, Final):
    value = Member(int)

    def __init__(self, val):
        self.value = val
    
    def __add__(self, other):
        value = self.value + other.value
        return Number(value) 
  
@Entrypoint
def main():

    a = Number(77)
    b = Number(2)

    for i in range(10000000):
        c = a+b
        a = c
    print(c)

Without the @Entrypoint it runs without any llvm magic, is that right?

When I do add the @Entrypoint then everything gets compiled (or the compiler barfs) ? Is that right? I'd like to know if it ever sneaks back into the python interpreter, like how cython does.

Here is a second attempt, where I try overloading __add__. It fails to compile.

Number = Forward("Number")

class Number(Class, Final):
    value = Member(int)
    
    def __init__(self, val):
        self.value = val
        
    def __str__(self):
        return str(self.value)
        
    def __add__(self, other:Number):
        value = self.value + other.value
        return Number(value) 
        
    def __add__(self, other:int):
        value = self.value + value
        return Number(value) 

Also, without the Final, it complains that I haven't annotated a return type for __add__. Can I annotate that __add__ returns a Number ? Or do I just stick with Final?

I'm glad there is a comprehensive test suite, but lines like this: __add__ = lambda self, other: C("add") make no sense to me. Maybe someone should add a directory of example code ? Examples that don't compile/run would also be good to see.

punkdit avatar Feb 25 '21 19:02 punkdit

  1. yes, you are generally doing the right thing. There are still some ways the system is finicky if you don't know the rules, and making that less obscure is something I'd love to do, so your feedback is super helpful.
  2. yes, you have to cross an @Entrypoint barrier before it invokes the compiler. It attempts to compile all the way down. It will drop back into the interpreter if it can't figure out what's going on (for instance, if you use a python builtin it doesn't understand). You can see what it's compiling by running with the TP_COMPILER_VERBOSE environment variable set to 1, 2, 3, 4, or 5, to get increasing levels of detail. Accidentally hitting the interpreter is super annoying, so I was planning on trying to build something to assert that everything is well typed (as an argument to Entrypoint, for instance). Feedback on that plan welcome. For the most part, the intent is that the compiler does not throw exceptions - if it doesn't understand something it should hit the interpreter, and then we should allow features that you can use to assert that things are compiled. Along these lines, you can 'from typed_python import isCompiled' and then 'assert isCompiled' within your code, if you're trying to determine whether you are running in compiled mode or the interpreter.
  3. what's the error on compilation? That ought to work - it's probably just something stupid I need to fix. I did a whole pass on implicit conversion rules recently - we don't have that many cases of operator overloading with classes, so maybe that's a regression I can fix easily.
  4. if you don't put Final, then the compiler wants type annotations. This lets it generate something akin to a vtable in C++ so that it can compile against the base class and dispatch to overloads in subclasses. 'Final' means you can't have a subclass, so the compiler doesn't dispatch against the vtable and just uses the type directly (this is usually much faster because it can inline).
  5. if you do want to leave 'Final' off, then yes, you can annotate -> Number.
  6. I agree - we have in-house projects (where I work) that use this extensively, but not that much client code in the public domain. I did just push up a 'sorted_dict' class in typed_python/lib that is a reasonable example of using TP to build a fairly nontrivial class, and I'm hoping to add more in the future. The tests are mostly there to verify that the system works, so they're not the best examples.
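
To make point 2 concrete, here is a minimal sketch of checking whether a function actually ran compiled. It assumes isCompiled is called as a function and returns True only inside compiled code; the thread itself only shows 'assert isCompiled', so treat the call form as an assumption. The names sum_to and ran_compiled are made up for illustration, and the ImportError fallback exists only so the sketch still runs on a machine without typed_python.

```python
try:
    from typed_python import Entrypoint, isCompiled
except ImportError:
    # Fallback stubs (not part of typed_python) so the sketch still runs
    # where the library is not installed. Everything stays interpreted.
    def Entrypoint(func):
        return func

    def isCompiled():
        return False


@Entrypoint
def sum_to(n: int) -> int:
    # A simple typed loop that the compiler should be able to handle.
    total = 0
    for i in range(n):
        total += i
    return total


@Entrypoint
def ran_compiled() -> bool:
    # Assumption: True inside compiled code, False in the interpreter.
    return isCompiled()


print(sum_to(10))
print("compiled" if ran_compiled() else "interpreted")
```

To see what the compiler is doing, the same script can be run with the TP_COMPILER_VERBOSE environment variable set to a value from 1 to 5, as described above.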

braxtonmckee avatar Feb 25 '21 19:02 braxtonmckee

Number = Forward("Number")

class Number(Class, Final):
    value = Member(int)
    
    def __init__(self, val):
        self.value = val
        
    def __str__(self):
        return str(self.value)
        
    def __add__(self, other:Number):
        value = self.value + other.value
        return Number(value) 
        
    def __add__(self, other:int):
        v = self.value + value
        return Number(v)

        
@Entrypoint
def main():

    a = Number(77)
    b = Number(2)

    for i in range(10000000):
        c = a+b
        d = a+2
    print(c)

The error is:

Traceback (most recent call last):
  File "./test_typed_python.py", line 68, in <module>
    main()
  File "./test_typed_python.py", line 59, in main
    for i in range(10000000):
  File "./test_typed_python.py", line 61, in main
    d = a+2
  File "./test_typed_python.py", line 45, in __add__
    value = self.value + other.value
AttributeError: int object has no attribute 'value'

Also, trying to return Number from __add__:

Number = Forward("Number")

class Number(Class, Final):
    value = Member(int)
    
    def __init__(self, val):
        self.value = val
        
    def __str__(self):
        return str(self.value)
        
    def __add__(self, other:Number) -> Number:
        value = self.value + other.value
        return Number(value) 
        
@Entrypoint
def main():

    a = Number(77)
    b = Number(2)

    for i in range(10000000):
        c = a+b
        b = a
    print(c)

Raises an assertion error:

Traceback (most recent call last):
  File "./test_typed_python.py", line 63, in <module>
    main()
  File "/home/simon/site-packages/typed_python/compiler/runtime.py", line 284, in compileFunctionOverload
    callTarget = self.converter.convertTypedFunctionCall(
  File "/home/simon/site-packages/typed_python/compiler/python_to_native_converter.py", line 634, in convertTypedFunctionCall
    return self.convert(
  File "/home/simon/site-packages/typed_python/compiler/python_to_native_converter.py", line 807, in convert
    self._resolveAllInflightFunctions()
  File "/home/simon/site-packages/typed_python/compiler/python_to_native_converter.py", line 514, in _resolveAllInflightFunctions
    nativeFunction, actual_output_type = functionConverter.convertToNativeFunction()
  File "/home/simon/site-packages/typed_python/compiler/function_conversion_context.py", line 467, in convertToNativeFunction
    body_native_expr, controlFlowReturns = self.convert_function_body(variableStates)
  File "/home/simon/site-packages/typed_python/compiler/function_conversion_context.py", line 1159, in convert_function_body
    return self.convert_statement_list_ast(
  File "/home/simon/site-packages/typed_python/compiler/function_conversion_context.py", line 2352, in convert_statement_list_ast
    res = self.convert_statement_ast(s, variableStates, controlFlowBlocks)
  File "/home/simon/site-packages/typed_python/compiler/function_conversion_context.py", line 1365, in convert_statement_ast
    return self._convert_statement_ast(ast, variableStates, controlFlowBlocks)
  File "/home/simon/site-packages/typed_python/compiler/function_conversion_context.py", line 1877, in _convert_statement_ast
    return self.convert_iteration_expression(to_iterate, ast, "", controlFlowBlocks)
  File "/home/simon/site-packages/typed_python/compiler/function_conversion_context.py", line 2083, in convert_iteration_expression
    inner, innerReturns = self.convert_statement_list_ast(
  File "/home/simon/site-packages/typed_python/compiler/function_conversion_context.py", line 2352, in convert_statement_list_ast
    res = self.convert_statement_ast(s, variableStates, controlFlowBlocks)
  File "/home/simon/site-packages/typed_python/compiler/function_conversion_context.py", line 1365, in convert_statement_ast
    return self._convert_statement_ast(ast, variableStates, controlFlowBlocks)
  File "/home/simon/site-packages/typed_python/compiler/function_conversion_context.py", line 1606, in _convert_statement_ast
    true, true_returns = self.convert_statement_list_ast(
  File "/home/simon/site-packages/typed_python/compiler/function_conversion_context.py", line 2352, in convert_statement_list_ast
    res = self.convert_statement_ast(s, variableStates, controlFlowBlocks)
  File "/home/simon/site-packages/typed_python/compiler/function_conversion_context.py", line 1365, in convert_statement_ast
    return self._convert_statement_ast(ast, variableStates, controlFlowBlocks)
  File "/home/simon/site-packages/typed_python/compiler/function_conversion_context.py", line 1452, in _convert_statement_ast
    val_to_store = subcontext.convert_expression_ast(ast.value)
  File "/home/simon/site-packages/typed_python/compiler/expression_conversion_context.py", line 1622, in convert_expression_ast
    return lhs.convert_bin_op(ast.op, rhs)
  File "/home/simon/site-packages/typed_python/compiler/typed_expression.py", line 280, in convert_bin_op
    return self.expr_type.convert_bin_op(self.context, self, op, rhs, inplace)
  File "/home/simon/site-packages/typed_python/compiler/type_wrappers/class_or_alternative_wrapper_mixin.py", line 434, in convert_bin_op
    return self.convert_method_call(context, l, magic, (r,), {})
  File "/home/simon/site-packages/typed_python/compiler/type_wrappers/class_wrapper.py", line 585, in convert_method_call
    return typeWrapper(func).convert_call(context, None, [instance] + list(args), kwargs)
  File "/home/simon/site-packages/typed_python/compiler/type_wrappers/python_typed_function_wrapper.py", line 313, in convert_call
    singleConvertedOverload = context.functionContext.converter.convert(
  File "/home/simon/site-packages/typed_python/compiler/python_to_native_converter.py", line 791, in convert
    functionConverter = self.createConversionContext(
  File "/home/simon/site-packages/typed_python/compiler/python_to_native_converter.py", line 237, in createConversionContext
    return ConverterType(
  File "/home/simon/site-packages/typed_python/compiler/function_conversion_context.py", line 1118, in __init__
    self._constructInitialVarnameToType()
  File "/home/simon/site-packages/typed_python/compiler/function_conversion_context.py", line 650, in _constructInitialVarnameToType
    self._varname_to_type[FunctionOutput] = typeWrapper(self._output_type)
  File "/home/simon/site-packages/typed_python/compiler/function_conversion_context.py", line 54, in <lambda>
    typeWrapper = lambda t: typed_python.compiler.python_object_representation.typedPythonTypeToTypeWrapper(t)
  File "/home/simon/site-packages/typed_python/compiler/python_object_representation.py", line 91, in typedPythonTypeToTypeWrapper
    _type_to_type_wrapper_cache[t] = _typedPythonTypeToTypeWrapper(t)
  File "/home/simon/site-packages/typed_python/compiler/python_object_representation.py", line 211, in _typedPythonTypeToTypeWrapper
    assert False, (t, getattr(t, '__typed_python_category__', None))
AssertionError: (<class '__main__.Number'>, 'Forward')
./test_typed_python.py:55
    c=None
    a=Number
    b=Number
:0
    .for.54.iteratorMaxValue=int
    .iterate_over.54=RangeCls
    i=int
    c=None
    .for.54.iteratorValue=int
    a=Number
    b=Number
./test_typed_python.py:54
    c=None
    i=int
    range=None
    a=Number
    b=Number

punkdit avatar Feb 25 '21 20:02 punkdit

Hi Simon,

You're hitting a slightly different bug. When you use forwards, you need to define the type they resolve to. So

Number = Forward("Number")

@Number.define
class Number(Class, Final):

    value = Member(int)

    def __init__(self, val):
        self.value = val

    def __str__(self):
        return str(self.value)

    def __add__(self, other: Number):
        value = self.value + other.value
        return Number(value)

    def __add__(self, other: int):
        v = self.value + other
        return Number(v)

will work. What you had before is a case of an 'Unresolved Forward'. The interpreter actually won't let you construct an instance of Number and throws an exception because of it (which is correct). The compiler isn't checking this case correctly, and then the compiler logic for overload matching must also have a bug dealing with the forward.

braxtonmckee avatar Feb 26 '21 03:02 braxtonmckee

How does str.format work?

class Number(Class, Final):
    value = Member(int)

    def __init__(self, val):
        self.value = val

    def __str__(self):
        return "Number({})".format(str(self.value))
    

gives error:

 File "./test_typed_python.py", line 16, in __str__
    return "Number({})".format(str(self.value))
TypeError: Can't call str.format with args of type ()

punkdit avatar Feb 26 '21 12:02 punkdit

Here is the beginnings of an arbitrary length integer type:

Int64 = int # for those of us that don't think like a C compiler
Long = Forward("Long")

@Long.define
class Long(Class):
    digits = Member(ListOf(Int64))

    def __init__(self, val:Int64):
        self.digits = [val]
    
    def __init__(self, val:ListOf(Int64)):
        self.digits = list(val)
    
    def __str__(self):
        return "Long("+str(self.digits)+")"

    def __add__(self, other:Long) -> Long:
        value = self.digits[0] + other.digits[0]
        return Long(value)

    def __add__(self, other:Int64) -> Long:
        value = self.digits[0] + other
        return Long(value)

My benchmark is here:

@Entrypoint
def main():

    a = Long(77)
    b = Long(2)

    for i in range(10000000):
        c = a+b
        b = b+1
    print(c)

This runs in about 50s compiled, and 1m25s uncompiled. Using Python's native ints it runs in about 1s. Is there something I am missing here?

Update: I replaced "ListOf(Int64)" with "TupleOf(Int64)" because these things don't need to be mutable. And now the benchmark runs in 7s. Huzzah!
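
The __add__ methods above only add the first digit. As a rough sketch of where this could go next, here is carry propagation across a tuple of digits in plain Python. The base (2**32), the little-endian digit order, and the helper names add_digits and to_int are all assumptions made for illustration; none of them come from typed_python or from the code above.

```python
BASE = 2 ** 32  # assumed digit base; the thread does not pick one


def add_digits(a, b):
    """Add two little-endian digit tuples and return a new digit tuple."""
    result = []
    carry = 0
    for i in range(max(len(a), len(b))):
        total = carry
        if i < len(a):
            total += a[i]
        if i < len(b):
            total += b[i]
        result.append(total % BASE)  # digit that stays in this position
        carry = total // BASE        # overflow carried to the next position
    if carry:
        result.append(carry)
    return tuple(result)


def to_int(digits):
    """Convert a little-endian digit tuple back to a Python int (for checking)."""
    return sum(d * BASE ** i for i, d in enumerate(digits))


print(add_digits((BASE - 1,), (1,)))
print(to_int(add_digits((5, 7), (BASE - 1, 1))))
```

Returning a fresh tuple each time matches the immutable TupleOf(Int64) layout from the update.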

punkdit avatar Feb 26 '21 12:02 punkdit

from typed_python import Forward, Member, Class, ListOf, Final, Entrypoint
import time


Int64 = int # for those of us that don't think like a C compiler
Long = Forward("Long")

@Long.define
class Long(Class, Final):
    digits = Member(ListOf(Int64))

    def __init__(self, val:Int64):
        self.digits.append(val)
    
    def __init__(self, val: ListOf(Int64)):
        self.digits = val
    
    def __str__(self):
        return "Long("+str(self.digits)+")"

    def __add__(self, other: Long) -> Long:
        value = self.digits[0] + other.digits[0]
        return Long(value)

    def __add__(self, other: Int64) -> Long:
        value = self.digits[0] + other
        return Long(value)


def main2():
    a = 77
    b = 2

    for i in range(10000000):
        c = a+b
        b = b+1
    
    print(c)

@Entrypoint
def main():
    a = Long(77)
    b = Long(2)

    for i in range(10000000):
        c = a+b
        b = b+1
    print(c)

main()

t0 = time.time()
main()
print(time.time() - t0)

t0 = time.time()
main2()
print(time.time() - t0)

For me, this is about 4x slower than the interpreter. I took out 'self.digits = [val]' and replaced it with self.digits.append(val), etc.
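
The script above calls main() once before starting the clock, presumably so that compile time is not counted in the measurement. A small helper along those lines might look like the following. The names time_after_warmup and plain_loop are hypothetical, and the loop is shortened so it finishes quickly.

```python
import time


def time_after_warmup(func, warmup_calls=1):
    """Call func a few times untimed, then time one more call."""
    for _ in range(warmup_calls):
        func()  # for an @Entrypoint function, the first call is where compilation would happen
    start = time.perf_counter()
    result = func()
    elapsed = time.perf_counter() - start
    return result, elapsed


def plain_loop():
    # Same shape as main2() above, with a smaller iteration count.
    a = 77
    b = 2
    c = 0
    for i in range(100000):
        c = a + b
        b = b + 1
    return c


result, seconds = time_after_warmup(plain_loop)
print(result, seconds)
```

Passing an @Entrypoint function in place of plain_loop should give a compiled timing that can be compared with the interpreted one.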

A few things are going on here:

  1. TP's refcounting under the hood uses atomic incref/decref, which is actually pretty slow compared to regular old integer operations. This can make code that involves lots of object creation/destruction slower than the interpreter. I'd like to do a serious pass of optimization on collapsing refcounts and promoting temporary objects whose lifetimes we know to the stack, but we basically do nothing about this right now.
  2. empty tuples don't require memory allocation at all, which gives you a bit of the performance win you noticed.
  3. constructing temporary lists like [val] is still pretty slow. There are internal mechanisms (TypedListMasqueradingAsList) that try to retain the typing on untyped lists, but it's always faster (right now) to construct the typed objects directly.

Over the long run, for very performance sensitive cases, my plan is to extend the interface for classes so that if you want, you can have proper constructor/destructor/assign/move semantics like in C++, which would allow you to manage the internals of such a class much more directly - you could, for instance, directly allocate the memory you need, or implement refcounting yourself without atomics, etc. It would also let you code classes that live directly on the stack (as opposed to on the heap) which would get rid of a lot of the refcounting issues (at the cost of being less clear about the lifetime of temporaries). All the infrastructure is there for this, I just have to find time to do it. I think this would let you build something pretty fast, actually, since you could do the same optimization that C++ does for small strings, and pack smaller integer objects directly into the object structure (with no heap allocation).

braxtonmckee avatar Feb 26 '21 14:02 braxtonmckee

Separately, on string formatting, I guess we didn't explicitly model str.format. f-strings work, although we have not implemented all the twiddly little formatters.
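
For the __str__ in question, the rewrite would look roughly like this. It is shown as plain functions so it runs anywhere; the function names are made up, and I have not checked which format specifiers the compiler accepts, so this sticks to a bare {value}.

```python
def describe_with_format(value: int) -> str:
    # The form from the question, which raised a TypeError when compiled.
    return "Number({})".format(str(value))


def describe_with_fstring(value: int) -> str:
    # Same output using an f-string, the form reported to work.
    return f"Number({value})"


print(describe_with_format(79))
print(describe_with_fstring(79))
```

Inside the class, the method body would become return f"Number({self.value})".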

braxtonmckee avatar Feb 26 '21 14:02 braxtonmckee