
Getting pytype to treat a function as having a different function signature

Open theahura opened this issue 3 years ago • 6 comments

This is sort of an obscure use case, but I'm wondering if pytype supports this.

I have a worker library that contains functions like:

registry = {}

def foo(arg1: int, arg2: Sequence[Dict[Text, Any]]):
  run_foo()
  
registry['foo'] = foo

while True:
  message = json.loads(queue.get_message())
  registry[message['name']](*message['args'])

I have an API server library that looks something like this:

def enqueue(fn_name: str, fn_args: Any):
  message = json.dumps({'name': fn_name, 'args': fn_args})
  queue.send_message(message)

The API server and the worker binaries have a message queue between them, such as RabbitMQ or SQS. But the API server can import the worker as a library, and thereby get all of the worker's function signatures.

My question: is there a way to tell pytype that when the API server calls enqueue('foo', {'arg1': 5, 'arg2': [{'hello': 'world'}]}) it should run type checking as if the API server were calling foo directly?

One possibility that I think would work is to wrap every worker function with something that checks an env variable and only runs the function body if the function is being called from the worker, e.g.

def foo(arg1: int, arg2: Sequence[Dict[Text, Any]]):
  if ENV == 'WORKER':
    run_foo()
  # Do nothing if called from any other env

and then in the enqueue:

import workerlib

def enqueue(fn_name: str, fn_args: Any):
  workerlib.registry[fn_name](*fn_args)  # Shouldn't actually run the fn body because ENV is not WORKER
  message = json.dumps({'name': fn_name, 'args': fn_args})
  queue.send_message(message)

This would work because the API server actually does call foo directly, and so pytype should treat that as any other function call...but this seems inelegant.
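The per-function env check could be factored into a single decorator. The following is only a sketch under assumptions: `worker_only` is a hypothetical name, and reading the flag from `os.environ['ENV']` is a guess at how `ENV` would be set. The `TypeVar`/`cast` pattern is the usual way to tell a static checker that the decorated function keeps its original signature.

```python
import functools
import os
from typing import Any, Callable, List, Tuple, TypeVar, cast

F = TypeVar('F', bound=Callable[..., Any])


def worker_only(fn: F) -> F:
  """Hypothetical decorator: run fn's body only when ENV == 'WORKER'."""

  @functools.wraps(fn)
  def wrapper(*args: Any, **kwargs: Any) -> Any:
    # Assumption: the environment flag lives in os.environ['ENV'].
    if os.environ.get('ENV') == 'WORKER':
      return fn(*args, **kwargs)
    return None  # Do nothing if called from any other env.

  # Tell the static checker the wrapper has the same signature as fn.
  return cast(F, wrapper)


calls: List[Tuple[int, str]] = []  # records real executions, for illustration


@worker_only
def foo(arg1: int, arg2: str) -> None:
  calls.append((arg1, arg2))
```

Note that outside the worker the wrapper accepts any arguments at runtime, so a wrong call is only caught by the static checker, not when the code runs.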

Is there a different solution that doesn't require calling the function? Or, alternatively, would it work if I overwrote the signature programmatically (https://stackoverflow.com/a/33112180/3269537)?

theahura, Jan 02 '22 23:01

Another possible way to do this is to try something like:

import types


def copy_func(f, name=None):
  fn = types.FunctionType(f.__code__, f.__globals__, name or f.__name__,
                          f.__defaults__, f.__closure__)
  # in case f was given attrs (note this dict is a shallow copy):
  fn.__dict__.update(f.__dict__)
  return fn


def foo(arg1, arg2):
  print(arg1, arg2)


def dummy(*args, **kwargs):
  pass


x = copy_func(foo)
x.__code__ = dummy.__code__

x(1, 2)  # Passes pytype, and does nothing at runtime.
x(1, 2, 3)  # SHOULD fail pytype (expected 2, got 3), does nothing at runtime.
foo(1, 2)  # Works as normal.

The problem with the approach above is the second call x(1, 2, 3) does not actually cause pytype to fail. Is there a way to get pytype to keep track of the function sig through the copy_func call?

theahura, Jan 03 '22 04:01

this is unfortunately not possible without an overlay supporting your enqueue function. overriding the signature at runtime won't work either because pytype statically analyses the bytecode, and will therefore pick up on the signature of the enqueue function rather than that of foo.
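To illustrate the point about runtime overrides, here is a small sketch (the function bodies are placeholders, not code from the worker library). Assigning `__signature__` changes what `inspect.signature` reports at runtime, but the parameters declared in the source and compiled into the code object, which is what a static analyzer looks at, stay the same.

```python
import inspect
from typing import Any


def foo(arg1: int, arg2: str) -> None:
  pass


def enqueue(fn_name: str, fn_args: Any) -> None:
  pass


# Runtime-only override: inspect.signature() honors __signature__ ...
enqueue.__signature__ = inspect.signature(foo)
runtime_view = inspect.signature(enqueue)

# ... but the compiled function still declares its original parameters.
code = enqueue.__code__
declared_arg_names = code.co_varnames[:code.co_argcount]
```

After this runs, `runtime_view` matches the signature of `foo`, while `declared_arg_names` is still `('fn_name', 'fn_args')`.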

one somewhat drastic method i can think of is to use code generation to end up with something like

@api_call
def foo(arg1: int, arg2: str):
  pass

where you use an external code generator to write the function signatures into a client library, and a decorator to populate the body of the method with a call to enqueue based on introspecting the signature.
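A rough sketch of the decorator half of that idea follows; the code generator that would write the stubs is not shown. Everything here is an assumption for illustration: `api_call` is implemented one possible way, `sent_messages` is a stand-in for the real queue, and only plain positional-or-keyword parameters are handled.

```python
import functools
import inspect
import json
from typing import Any, Callable, List, TypeVar, cast

F = TypeVar('F', bound=Callable[..., Any])

sent_messages: List[str] = []  # hypothetical stand-in for the real queue


def enqueue(fn_name: str, fn_args: Any) -> None:
  sent_messages.append(json.dumps({'name': fn_name, 'args': fn_args}))


def api_call(fn: F) -> F:
  """Replace fn's body with a call to enqueue, keeping its signature."""
  sig = inspect.signature(fn)

  @functools.wraps(fn)
  def wrapper(*args: Any, **kwargs: Any) -> None:
    # bind() raises TypeError if the call does not match fn's signature.
    bound = sig.bind(*args, **kwargs)
    bound.apply_defaults()
    # Arguments come out in parameter order, ready for positional replay.
    enqueue(fn.__name__, list(bound.arguments.values()))

  # Static checkers see the stub's declared signature, not the wrapper's.
  return cast(F, wrapper)


@api_call
def foo(arg1: int, arg2: str):
  pass
```

With this in place, `foo(1, 'a')` on the API server sends a message instead of running any worker code, and a call such as `foo(1, 'a', 3)` fails both statically and at runtime.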

martindemello, Jan 05 '22 04:01

Maybe a silly solution, but can I just lie about the type? I did something like this:


from typing import TypeVar

T = TypeVar('T')


def foo(arg1: int, arg2: int):
  return arg1


def dummy(*args, **kwargs):
  pass


def just_sig(f: T) -> T:
  # Basically we just lie about the return type here...
  return dummy

x = just_sig(foo)
x(1, 2)  # Passes pytype, and does nothing at runtime.
x(1, 2, 3)  # Fails pytype (expected 2, got 3), does nothing at runtime.
foo(1, 2)  # This should work as normal.

and I think it worked, in that the three test cases at the bottom all did what I expected them to.

theahura, Jan 05 '22 04:01

you could, but that seems like a lot of boilerplate too - are you going to do it once per function you call? i might be misunderstanding what you are trying to do here. how would you actually use the above x() in your code?

martindemello, Jan 05 '22 04:01

Well at this point I think I could just call the just_sig function like any other function. So I could do something like:

registry = []  # a list of fns populated by the decorator below

def add_to_registry(fn):
  registry.append(fn)
  return fn  # return fn unchanged so it can still be called directly

@add_to_registry
def foo(arg1, arg2):
  do_foo()

tasks = {}
for fn in registry:
  tasks[fn.__name__] = {
    'fn': fn,
    'sig': just_sig(fn),
  }

def enqueue(fn_name: str, fn_args: Any):
  tasks[fn_name]['sig'](*fn_args)  # Calls the dummy fn instead of the real one
  message = json.dumps({'name': fn_name, 'args': fn_args})
  queue.send_message(message)

I think this minimizes boilerplate because just_sig() is generic. I can define it once and then use it for any function that comes later, and pytype will think that whatever just_sig() returns has the same signature as the function passed in.

(At least, I think so! I could be way off base here...)

theahura, Jan 05 '22 04:01

ah. i see. yes, that might indeed work if pytype can track all the string literals you are using to access the tasks dictionary correctly, though i wouldn't be surprised if it didn't. definitely worth experimenting with though.
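One way to sidestep the string-keyed lookup entirely, which is a different technique from anything proposed above, is to pass the function object itself and let `ParamSpec` (PEP 612) tie the remaining arguments to its signature. This is a sketch under assumptions: `sent_messages` stands in for the queue, the `kwargs` field in the message is an addition, and nothing in this thread confirms how completely pytype checks `ParamSpec`.

```python
import json
from typing import Any, Callable, List

try:
  from typing import ParamSpec  # Python 3.10+
except ImportError:
  from typing_extensions import ParamSpec  # backport for older versions

P = ParamSpec('P')

sent_messages: List[str] = []  # hypothetical stand-in for the real queue


def enqueue(fn: Callable[P, Any], *args: P.args, **kwargs: P.kwargs) -> None:
  # A checker that supports ParamSpec validates args/kwargs against fn.
  message = json.dumps(
      {'name': fn.__name__, 'args': list(args), 'kwargs': kwargs})
  sent_messages.append(message)


def foo(arg1: int, arg2: str) -> None:
  pass


enqueue(foo, 5, 'hello')  # Checked like foo(5, 'hello'); foo is never run.
```

The trade-off is that call sites must reference the function rather than its name, so the API server still has to import the worker library.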

martindemello, Jan 05 '22 05:01