feat(mdk): mdk python

Open michael-0acf4 opened this issue 1 year ago • 8 comments

MDK for the Python runtime

Migration notes

None

Summary by CodeRabbit

  • New Features

    • Introduced new functionalities for generating Python code based on configurations, including handling of templates and required objects.
    • Added Python script templates for defining typed functions and structured objects with comprehensive data type handling.
    • Enhanced type management and priority handling in utility functions.
  • Documentation

    • Provided detailed summaries and documentation for new functionalities and templates.
  • Refactor

    • Implemented new structures and methods for efficient code generation and type handling.
  • Tests

    • Added tests for defining typegraph structures and policies in Python.
  • Chores

    • Updated URLs in the .ghjk/deno.lock file to reflect new changes in the codebase.

michael-0acf4 avatar May 03 '24 21:05 michael-0acf4

I ended up using a class-based approach.

Example output scripts/example.py (entry point)

from .example_types import Output, Object26, typed_duplicate, Input, typed_add, TypeAddResult


@typed_duplicate
def duplicate(inp: Input) -> Output:
    # TODO: write your logic here
    raise Exception("duplicate not implemented")

@typed_add
def add(inp: Object26) -> TypeAddResult:
    # TODO: write your logic here
    raise Exception("add not implemented")
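
As a rough illustration of how the TODO stubs above might be filled in, here is a minimal sketch of `add`. To keep it runnable on its own, `Object26`, `TypeAddResult` and `typed_add` are simplified stand-ins (an assumption) for the generated `example_types` module, not the generated code itself.

```python
from dataclasses import dataclass
from typing import Callable


# Simplified stand-ins for the generated example_types module (assumed).
@dataclass
class Object26:
    a: int
    b: int


TypeAddResult = int


def typed_add(user_fn: Callable[[Object26], TypeAddResult]):
    # Wrap the user function so callers can pass a raw dict.
    def exported_wrapper(raw_inp: dict):
        return user_fn(Object26(**raw_inp))
    return exported_wrapper


@typed_add
def add(inp: Object26) -> TypeAddResult:
    # User logic works on typed attributes instead of dict keys.
    return inp.a + inp.b


result = add({"a": 1, "b": 2})
print(result)
```

The decorated function is called with the raw input dict, while the body only ever sees the typed object.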

scripts/example_types.py

from types import NoneType
from typing import Callable, List, Union, get_origin, ForwardRef
from dataclasses import dataclass, asdict, fields

FORWARD_REFS = {}

class Struct:
    def try_new(dt_class, val: any):
        # Object
        ftypes = {f.name: f.type for f in fields(dt_class)}
        attrs = {}
        for f in val:
            fval = val[f]
            ftype = ftypes[f]
            serialized = False
            # Union
            if get_origin(ftype) is Union:
                try:
                    attrs[f] = Struct.try_union(ftype.__args__, fval)
                    serialized = True
                except Exception as _e:
                    pass
            # List
            elif get_origin(ftype) is list:
                try:
                    # Pass the List[...] hint itself: try_typed_list reads tpe.__args__
                    attrs[f] = Struct.try_typed_list(ftype, fval)
                    serialized = True
                except Exception as _e:
                    pass
            # Any
            if not serialized:
                if type(ftype) is str and ftype in FORWARD_REFS:
                    klass = FORWARD_REFS[ftype]
                    attrs[f] = Struct.new(klass, fval)
                else:
                    attrs[f] = Struct.new(ftype, fval)
        return dt_class(**attrs)

    def try_typed_list(tpe: any, items: any):
        hint = tpe.__args__[0]
        klass = FORWARD_REFS[hint.__forward_arg__] if type(hint) is ForwardRef else hint
        return [Struct.new(klass, v) for v in items]

    def try_union(variants: List[any], val: any):
        errors = []
        for variant in variants:
            try:
                if variant is NoneType:
                    if val is None:
                        return None
                    else:
                        continue
                if get_origin(variant) is list:
                    if type(val) is list:
                        return Struct.try_typed_list(variant, val)
                    else:
                        continue
                klass = FORWARD_REFS[variant.__forward_arg__]
                return Struct.try_new(klass, val)
            except Exception as e:
                errors.append(str(e))
        raise Exception("\n".join(errors))


    def new(dt_class: any, val: any):
        try:
            return Struct.try_new(dt_class, val)
        except Exception:
            return val

    def repr(self):
        return asdict(self)


@dataclass
class Input(Struct):
    string: str
    integer: int
    email: Union[str, None]
    list_integer: List[int]
    opt_union_flat: Union[Union[float, int], None]
    reference: Union[List['References'], None]
    self: Union[List['Example'], None]
    

FORWARD_REFS['Input'] = Input

@dataclass
class References(Struct):
    string: str
    example: Union['Example', None]
    

FORWARD_REFS['References'] = References

@dataclass
class Example(Struct):
    string: str
    integer: int
    email: Union[str, None]
    list_integer: List[int]
    opt_union_flat: Union[Union[int, float], None]
    reference: Union[List['References'], None]
    self: Union[List['Example'], None]
    

FORWARD_REFS['Example'] = Example

@dataclass
class Output(Struct):
    string: str
    integer: int
    email: Union[str, None]
    list_integer: List[int]
    opt_union_flat: Union[Union[float, int], None]
    reference: Union[List['References'], None]
    self: Union[List['Example'], None]
    

FORWARD_REFS['Output'] = Output

@dataclass
class Object26(Struct):
    a: int
    b: int
    

FORWARD_REFS['Object26'] = Object26

TypeAddResult = int


def __repr(value: any):
    if isinstance(value, Struct):
        return value.repr()
    return value


def typed_duplicate(user_fn: Callable[[Input], Output]):
    def exported_wrapper(raw_inp):
        inp: Input = Struct.new(Input, raw_inp)
        out: Output = user_fn(inp)
        if type(out) is list:
            return [__repr(v) for v in out]
        return __repr(out)
    return exported_wrapper

def typed_add(user_fn: Callable[[Object26], TypeAddResult]):
    def exported_wrapper(raw_inp):
        inp: Object26 = Struct.new(Object26, raw_inp)
        out: TypeAddResult = user_fn(inp)
        if type(out) is list:
            return [__repr(v) for v in out]
        return __repr(out)
    return exported_wrapper
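
The forward-reference handling above can be hard to follow in the full listing, so here is a reduced sketch of the same idea: string annotations are resolved through a name-to-class registry, and nested dicts and lists are converted recursively. `REGISTRY`, `resolve`, `convert` and `Node` are hypothetical names for illustration only; this is not the generated `Struct` class and it makes no claim to match its edge-case behavior.

```python
from dataclasses import asdict, dataclass, fields
from typing import ForwardRef, List, Optional, Union, get_args, get_origin

# Hypothetical registry playing the role of FORWARD_REFS: class name -> class.
REGISTRY = {}


def resolve(hint):
    """Map a string or ForwardRef annotation to a registered class, if any."""
    if isinstance(hint, ForwardRef):
        hint = hint.__forward_arg__
    if isinstance(hint, str):
        return REGISTRY.get(hint, hint)
    return hint


def convert(hint, value):
    """Recursively turn dicts and lists into dataclass instances."""
    hint = resolve(hint)
    if value is None:
        return None
    origin = get_origin(hint)
    if origin is Union:
        # Try each non-None variant; the first one that converts wins.
        for variant in get_args(hint):
            if variant is type(None):
                continue
            try:
                return convert(variant, value)
            except (TypeError, KeyError):
                continue
        raise TypeError(f"no union variant matched {value!r}")
    if origin is list:
        if not isinstance(value, list):
            raise TypeError(f"expected a list, got {value!r}")
        (item_hint,) = get_args(hint)
        return [convert(item_hint, v) for v in value]
    if isinstance(hint, type) and hasattr(hint, "__dataclass_fields__"):
        if not isinstance(value, dict):
            raise TypeError(f"expected a dict, got {value!r}")
        ftypes = {f.name: f.type for f in fields(hint)}
        return hint(**{k: convert(ftypes[k], v) for k, v in value.items()})
    if isinstance(hint, type) and not isinstance(value, hint):
        raise TypeError(f"expected {hint.__name__}, got {value!r}")
    return value  # scalars pass through unchanged


@dataclass
class Node:
    name: str
    children: Optional[List["Node"]] = None


REGISTRY["Node"] = Node

raw = {"name": "root", "children": [{"name": "leaf", "children": None}]}
node = convert(Node, raw)
```

Converting back with `dataclasses.asdict` gives the original dict again, which is the same round trip the `typed_*` wrappers rely on.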

typegraph.py

from typegraph import typegraph, Policy, t, Graph
from typegraph.runtimes.python import PythonRuntime


@typegraph()
def example(g: Graph):
    references = t.struct(
        {"string": t.string(), "example": g.ref("Example").optional()},
        name="References",
    )
    example = t.struct(
        {
            "string": t.string(),
            "integer": t.integer(),
            "email": t.email().optional(),
            "list_integer": t.list(t.integer()),
            "opt_union_flat": t.union([t.integer(), t.integer(), t.float()]).optional(),
            "reference": t.list(references).optional(),
            "self": t.list(g.ref("Example")).optional(),
        },
        name="Example",
    )
    python = PythonRuntime()
    pub = Policy.public()
    g.expose(
        pub,
        duplicate=python.import_(
            example.rename("Input"),
            example.rename("Output"),
            name="duplicate",
            module="scripts/example.py",
        ),
        add=python.import_(
            t.struct({"a": t.integer(), "b": t.integer() }),
            t.integer().rename("AddResult"),
            name="add",
            module="scripts/example.py",
        ),
    )
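
One detail worth checking in the listings above: the typegraph lists `t.integer()` twice in `opt_union_flat`, and the generated hints order the variants differently (`float, int` in `Input`/`Output`, `int, float` in `Example`). The sketch below only exercises the standard `typing` module, so it says nothing about how metagen itself orders or deduplicates variants; it just suggests that these differences are harmless at the Python level, because `typing.Union` flattens nested unions and drops duplicate members.

```python
from typing import Union, get_args

# The two spellings of opt_union_flat found in the generated types.
flat_a = Union[Union[float, int], None]
flat_b = Union[Union[int, float], None]

# A union with a repeated member, analogous to the duplicated t.integer().
dup = Union[int, int, float]

args_a = get_args(flat_a)    # nested union is flattened
args_b = get_args(flat_b)
args_dup = get_args(dup)     # duplicate member is dropped

print(args_a, args_b, args_dup)
```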

michael-0acf4 avatar May 03 '24 22:05 michael-0acf4

@CodiumAI-Agent /review

zifeo avatar May 05 '24 14:05 zifeo

PR Review 🔍

⏱️ Estimated effort to review [1-5]

4, due to the complexity and size of the PR, which includes multiple new modules, complex logic, and integration with existing systems. The PR introduces a new Python generator and modifies the core generator handling logic, requiring careful review to ensure correctness and maintainability.

🧪 Relevant tests

No

⚡ Possible issues

Possible Bug: The GeneratorHandler struct and its method handle might not correctly handle errors from the generator functions. The current implementation directly calls the function and returns its result, but there is no error handling or logging in case of failures.

Performance Concern: The generate_target function now iterates over all generators to find a match for each generation request. This could be inefficient if the list of generators grows large.

🔒 Security concerns

No

Code feedback:

relevant file: libs/metagen/src/lib.rs
suggestion: Consider implementing error handling or logging within the GeneratorHandler::handle method to improve reliability and debuggability. [important]
relevant line: (*self.f)(value)

relevant file: libs/metagen/src/lib.rs
suggestion: Optimize the generator lookup in generate_target by using a HashMap or another efficient data structure to map generator names to their handlers, instead of iterating over a list. [important]
relevant line: .find(|g| g.name.eq(gen_name))

relevant file: libs/metagen/src/mdk_python/mod.rs
suggestion: Add error handling for potential failures in the new method of PythonGenerator, especially considering the validation of configuration which might throw errors. [important]
relevant line: config.validate(&())?;

relevant file: libs/metagen/src/mdk_python/mod.rs
suggestion: Consider adding more specific error messages or custom error types to improve the debugging experience and error traceability in the visit_type function. [medium]
relevant line: _ => bail!("Unsupported type {:?}", tpe.type_name()),
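
The first two suggestions (map-based lookup and error context around the handler call) can be sketched together. The real code lives in Rust in libs/metagen/src/lib.rs; the sketch below is written in Python to match the other examples in this thread, and everything in it is hypothetical apart from the `generate_target` name borrowed from the review: the `GENERATORS` map, the handler signature and both sample generators are assumptions for illustration.

```python
from typing import Callable, Dict

# Hypothetical handler signature: config dict in, {file path: contents} out.
Handler = Callable[[dict], Dict[str, str]]


def python_generator(config: dict) -> Dict[str, str]:
    return {f"{config['path']}/example.py": "# generated"}


def broken_generator(config: dict) -> Dict[str, str]:
    raise ValueError("invalid config")


# Name -> handler map: lookup is a single hash probe instead of a list scan.
GENERATORS: Dict[str, Handler] = {
    "mdk_python": python_generator,
    "broken": broken_generator,
}


def generate_target(gen_name: str, config: dict) -> Dict[str, str]:
    handler = GENERATORS.get(gen_name)
    if handler is None:
        raise KeyError(
            f"unknown generator {gen_name!r}; known: {sorted(GENERATORS)}"
        )
    try:
        return handler(config)
    except Exception as err:
        # Attach the generator name so the failure is traceable in logs.
        raise RuntimeError(f"generator {gen_name!r} failed: {err}") from err
```

With only a handful of generators the linear scan is unlikely to matter in practice, so the error context is probably the more useful half of this sketch.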

CodiumAI-Agent avatar May 05 '24 14:05 CodiumAI-Agent

Walkthrough

The recent updates across various modules enhance Python code generation capabilities, introduce new type handling functionalities, and improve type prioritization mechanisms. These changes include the addition of new structures and methods across Rust and Python files, aimed at better configuration management, dynamic type generation, and efficient template rendering for Python code generation.

Changes

File Path | Change Summary
.../mdk_python/mod.rs | Added configurations, generator struct, and methods for Python code generation.
.../mdk_python/static/main.py.jinja | Introduced typed function templates in Python.
.../mdk_python/static/types.py.jinja | Enhanced Struct class for advanced type handling.
.../mdk_python/utils.rs | Developed utilities for managing type generation and prioritization.
typegate/tests/metagen/typegraphs/python.py | Added functionality for defining typegraphs with specified structures.
.ghjk/deno.lock | Updated URLs for various TypeScript modules.

coderabbitai[bot] avatar May 06 '24 17:05 coderabbitai[bot]

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 73.85%. Comparing base (24963ce) to head (ad5fada). Report is 1 commit behind head on main.

:exclamation: Current head ad5fada differs from pull request most recent head 6fe428a

Please upload reports for the commit 6fe428a to get more accurate results.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #707      +/-   ##
==========================================
+ Coverage   73.52%   73.85%   +0.32%     
==========================================
  Files         120      120              
  Lines       14142    13777     -365     
  Branches     1416     1393      -23     
==========================================
- Hits        10398    10175     -223     
+ Misses       3717     3576     -141     
+ Partials       27       26       -1     

codecov[bot] avatar May 06 '24 17:05 codecov[bot]

Oh man. Am I glad to see the test-full timeout happening on this PR as well. I was losing hair over here thinking it was the deadlock bug.

Yohe-Am avatar May 08 '24 21:05 Yohe-Am

Looking at the logs of the runs on both PRs, it looks like something is wrong with the serialization of Python modules.

Yohe-Am avatar May 08 '24 21:05 Yohe-Am