python-lenses

Debugging code involving lenses is hard

Open Gurkenglas opened this issue 1 year ago • 10 comments

The call stacks end up super deep and the order of operations is strange. Have you considered a refactor where lens.....get() is compiled into a function block, in which .F(getter) corresponds to string2 = getter(int5), .Recur(Foo) corresponds to for foo in recur(Foo, bar):, and so on? If the only reason against it is months of tedious refactoring, say so; that might be the kind of work that AI tools can handle these days.

Gurkenglas, Apr 16 '23 08:04

I agree that the call stacks are monstrous. I'm not really sure how that refactor would work; could you give an example?

ingolemo, Apr 16 '23 09:04

Here's roughly the code of mine that I changed since I posted this issue. The change made debugging much easier, so this should be happening automatically behind the scenes. (Yes, this is not a lawful use of optics (it's fine if I only get nice debugging for lawful uses), and the two code blocks are not equivalent and both are wrong.)

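import re

import lenses
import openai

# First version: every check is its own lens step in a long chain.
def shrinkAttr(self, attr, regex):
    def asserter(x):
        # Check the attribute against the regex, then drop it.
        assert re.search(regex, x.__dict__[attr])
        del x.__dict__[attr]
    return self & lenses.Iso(asserter, lambda x: x)
lenses.ui.BaseUiLens.shrinkAttr = shrinkAttr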

def shrink(x):
    # Raw strings keep \w and \d from being read as string escapes.
    return (
        lenses.bind(x)
        .Recur(openai.OpenAIObject)
        .shrinkAttr("api_base_override", "None")
        .shrinkAttr("api_key", r"sk-\w{48}")
        .shrinkAttr("api_type", "None")
        .shrinkAttr("api_version", "None")
        .shrinkAttr("openai_id", r"chatcmpl-\w{29}")
        .shrinkAttr("organization", r"user-\w{24}")
        .shrinkAttr("typed_api_type", ".*")
        .shrinkAttr("id", r"chatcmpl-\w{29}")
        .shrinkAttr("object", "chat_completion")
        .shrinkAttr("created", r"\d*")
        .shrinkAttr("model", "gpt-4-0314")
        .get()
    )

# Second version: the checks live in one plain function applied via modify.
def shrinkOne(x):
    def attr(name, regex):
        # Check the attribute against the regex, then drop it.
        value = str(getattr(x, name))
        assert re.search(regex, value)
        delattr(x, name)
    attr("api_base_override", "None")
    attr("api_key", r"sk-\w{48}")
    attr("api_type", "None")
    attr("api_version", "None")
    attr("openai_id", r"chatcmpl-\w{29}")
    attr("organization", r"user-\w{24}")
    attr("typed_api_type", ".*")
    attr("id", r"chatcmpl-\w{29}")
    attr("object", "chat_completion")
    attr("created", r"\d*")
    attr("model", "gpt-4-0314")

def shrink(openaiobject):
    d = openaiobject.to_dict()
    return lenses.bind(d).Recur(dict).modify(shrinkOne)

Gurkenglas, Apr 16 '23 09:04

The OP example fleshed out: lens.Recur(Foo).GetAttr("id").F((str)).collect() could compile to:

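from typing import List

# Foo, Bar and recur are placeholders taken from the lens expression above.
def lensRecurFooGetAttrIdFStrCollect(arg: "Bar") -> List[str]:
    collect: List[str] = []
    for recurFooArg in recur(Foo, arg):  # each recurFooArg is a Foo
        getAttrId: int = recurFooArg.id
        fStr: str = str(getAttrId)
        collect.append(fStr)
    return collect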

Gurkenglas, Apr 16 '23 10:04

Yes, but the problem is that I don't know how to get from here to there. All the functions in the call stack are there to provide the abstraction necessary to allow all the lenses tools to compose together. Maybe some of those layers can be eliminated with clever tricks, but short of dynamically generating python code and then exec-ing it, I don't think the library can get it that clean.

Or is that what you're proposing here: exec? That would definitely be a big refactor…
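
To make the "generate Python source and exec it" idea concrete, here is a minimal sketch. It is not how the library works and it does not use the lenses API: the step format (`("getattr", name)` and `("call", function_name)`) and the names `generate_source` and `compile_steps` are hypothetical, and traversals such as Recur are left out entirely.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical step format: ("getattr", attribute_name) or ("call", function_name).
Step = Tuple[str, str]


def generate_source(name: str, steps: List[Step]) -> str:
    """Emit one flat function: a loop over `items`, one assignment per step."""
    lines = [
        f"def {name}(items):",
        "    collect = []",
        "    for v0 in items:",
    ]
    for i, (kind, arg) in enumerate(steps):
        # `arg` is pasted into the source, so it must come from trusted code.
        if kind == "getattr":
            lines.append(f"        v{i + 1} = v{i}.{arg}")
        elif kind == "call":
            lines.append(f"        v{i + 1} = {arg}(v{i})")
        else:
            raise ValueError(f"unknown step kind: {kind!r}")
    lines.append(f"        collect.append(v{len(steps)})")
    lines.append("    return collect")
    return "\n".join(lines) + "\n"


def compile_steps(
    name: str, steps: List[Step], env: Optional[Dict[str, object]] = None
) -> Callable:
    """Compile the generated source and return the resulting function."""
    source = generate_source(name, steps)
    # Names used by "call" steps are looked up in `env`, then in builtins.
    namespace: Dict[str, object] = dict(env or {})
    exec(compile(source, f"<generated {name}>", "exec"), namespace)
    return namespace[name]
```

With the steps `[("getattr", "id"), ("call", "str")]`, the generated function has the same flat shape as the hand-written example earlier in the thread: one loop and one assignment per step, so a failure points at a single short frame.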

ingolemo, Apr 16 '23 10:04

Yes, that's what I'm proposing! The fact that your syntax is lens.Recur(Foo).GetAttr("id").F((str)).collect()(arg) can be read as a hint that the lens.Recur(Foo).GetAttr("id").F((str)).collect() part can be precomputed. And it is, in general, silly that function values don't come with the "string that would produce that function" :D

When I look at the type of a lens like lens.Recur(Foo).GetAttr("id").F((str)), a voice in my head tells me that something already does the work required to produce the readable codeblock, in order to compute a type that is an impoverished version of that same codeblock.

Gurkenglas, Apr 16 '23 10:04

I thought lenses were confusing enough without trying to jam dynamic recompilation into the middle of everything.

I will admit that I find such a challenge tempting. I'll see if I can find time to do some experiments. No promises.

ingolemo, Apr 16 '23 10:04

Fallen at the first hurdle. Python doesn't show the source code when printing tracebacks from dynamically generated code because it looks up the filename and reads the file. For example, if you run exec('1/0') then you'll notice that 1/0 doesn't appear anywhere in the output. I can't find any hooks to make this work. A large traceback that references real code is better than a short traceback with no information.

I'm open to any other ways to make the tracebacks easier to read if anyone has ideas.

ingolemo, Apr 16 '23 16:04

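code = """
def f(x):
    y = x+2
    print(5)
    1/0
    return y*x
"""
# Write the source to a real file so the traceback machinery can read it back.
with open("/tmp/codef.py", "w") as source_file:
    source_file.write(code)
# Compile with that path as the filename, then define f by executing the code.
co = compile(code, "/tmp/codef.py", "exec")
exec(co)
f(2)  # raises ZeroDivisionError; the traceback shows the 1/0 line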

This produces the following in VSCode: [Screenshot from 2023-04-16 18-38-53]

Gurkenglas, Apr 16 '23 16:04

I don't really want to write to a temporary file on every invocation of a lens.
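
One possible way around the file write, sketched here under assumptions rather than as a tested fix for this library: the traceback module fetches source lines through the standard `linecache` module, and an entry placed in `linecache.cache` under a made-up filename is served from memory. The filename `<lens-generated-f>` and the helper names below are hypothetical. Storing `None` as the modification time is what keeps `linecache.checkcache` from discarding the entry.

```python
import linecache
import traceback
from typing import Callable

SOURCE = """\
def f(x):
    y = x + 2
    return y / 0
"""
FILENAME = "<lens-generated-f>"  # hypothetical name; no such file exists on disk


def register_source(filename: str, source: str) -> None:
    """Store the source in linecache so tracebacks can show its lines."""
    lines = source.splitlines(keepends=True)
    # Entry layout: (size, mtime, lines, fullname). mtime=None means the
    # entry is never compared against a file on disk.
    linecache.cache[filename] = (len(source), None, lines, filename)


def load_function(name: str, source: str, filename: str) -> Callable:
    """Register the source, compile it under `filename`, and return `name`."""
    register_source(filename, source)
    namespace: dict = {}
    exec(compile(source, filename, "exec"), namespace)
    return namespace[name]


def traceback_text(func: Callable, *args: object) -> str:
    """Call func and return the formatted traceback if it raises."""
    try:
        func(*args)
    except Exception:
        return traceback.format_exc()
    return ""
```

Calling `traceback_text(load_function("f", SOURCE, FILENAME), 2)` should give a traceback whose last frame shows the `return y / 0` line, even though nothing was written to a file.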

ingolemo, Apr 16 '23 16:04

See it as just-in-time compilation! Have you ever written code such that you couldn't cache the compiled version per calling code location with length_of_tmp_code in O(length_of_calling_code)?
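
The caching idea can be sketched as follows. This is an illustration under assumptions, not library code: `cached_per_call_site`, `build_function` and `STATS` are hypothetical names, the call site is identified with `sys._getframe` (a CPython implementation detail), and the "compiled" function is a fixed stand-in for whatever a lens would generate.

```python
import sys
from typing import Callable, Dict, Tuple

_CACHE: Dict[Tuple[str, int], Callable] = {}
STATS = {"builds": 0}  # counts how often the expensive build actually runs


def cached_per_call_site(build: Callable[[], Callable]) -> Callable:
    """Return build()'s result, computed once per (file, line) of the caller."""
    caller = sys._getframe(1)
    key = (caller.f_code.co_filename, caller.f_lineno)
    if key not in _CACHE:
        STATS["builds"] += 1
        _CACHE[key] = build()
    return _CACHE[key]


def build_function() -> Callable:
    """Stand-in for the expensive step: generate source, compile, exec."""
    namespace: dict = {}
    source = "def get_id_str(x):\n    return str(x.id)\n"
    exec(compile(source, "<generated get_id_str>", "exec"), namespace)
    return namespace["get_id_str"]


class Foo:
    def __init__(self, id: int) -> None:
        self.id = id


def ids(items: list) -> list:
    out = []
    for item in items:
        # Same source line on every iteration, so the build runs only once.
        func = cached_per_call_site(build_function)
        out.append(func(item))
    return out
```

Keying on the call site alone assumes the expression at that line always builds the same function. A lens built from runtime values would need those values in the key as well.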

Gurkenglas, Apr 16 '23 17:04