guidance icon indicating copy to clipboard operation
guidance copied to clipboard

[Feature] Allow `@guidance` to decorate methods

Open hudson-ai opened this issue 1 year ago • 9 comments
trafficstars

This PR extends the @guidance decorator to work with methods.

I believe this will be invaluable when defining grammars that are implemented as several guidance functions that call each other, particularly if they need to share common parameters. I think that rewriting json as a class will be a good test case for this, and I expect to be able to simplify the implementation with this new machinery as well as enable some otherwise-tricky shared parameters like separators=(", ", ": ").

To achieve this:

  • I introduce a GuidanceFunction type that is returned when the decorator is called on a function.
  • GuidanceFunction uses __get__ descriptor magic to ensure that the function is bound to an instance before being wrapped by the decorator's actual implementation (_decorator).
  • Bound GuidanceFunctions are given a GuidanceMethod type

Side note/todo: _guidance.pyi is now not strictly correct, but it still gives helpful type hints. It's most-wrong in the case where a user guidance-decorates a method but tries to call it from the class rather than an instance. This case should be broken anyway, but I need to think some more about how to give the user a good error message. IMO, that can wait for a follow-up PR.

hudson-ai avatar Sep 23 '24 19:09 hudson-ai

:warning: Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 96.90722% with 3 lines in your changes missing coverage. Please review.

Project coverage is 63.45%. Comparing base (63bfd9f) to head (612a810).

Files with missing lines Patch % Lines
guidance/_guidance.py 97.33% 2 Missing :warning:
guidance/_utils.py 95.45% 1 Missing :warning:

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

:exclamation: There is a different number of reports uploaded between BASE (63bfd9f) and HEAD (612a810). Click for more details.

HEAD has 69 uploads less than BASE
Flag BASE (63bfd9f) HEAD (612a810)
137 68
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1035      +/-   ##
==========================================
- Coverage   71.39%   63.45%   -7.94%     
==========================================
  Files          63       63              
  Lines        4670     4731      +61     
==========================================
- Hits         3334     3002     -332     
- Misses       1336     1729     +393     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar Sep 23 '24 19:09 codecov-commenter

I'm unsatisfied with the behavior of caches here -- they will prevent instances from being garbage collected, and cache entries themselves won't get cleared. Need to tinker with that a bit more...

hudson-ai avatar Sep 24 '24 15:09 hudson-ai

I'm unsatisfied with the behavior of caches here -- they will prevent instances from being garbage collected, and cache entries themselves won't get cleared. Need to tinker with that a bit more...

Maybe use weakref? The cached objects can then live and be used for a while, but will be cleared during a garbage collection if there is too much memory pressure.

paulbkoch avatar Sep 24 '24 16:09 paulbkoch

I'm unsatisfied with the behavior of caches here -- they will prevent instances from being garbage collected, and cache entries themselves won't get cleared. Need to tinker with that a bit more...

Maybe use weakref? The cached objects can then live and be used for a while, but will be cleared during a garbage collection if there is too much memory pressure.

My thoughts exactly. Woke up this morning in a cold sweat mumbling to myself "...the garbage collector!" (joking).

Notice that there are essentially two levels of caches here -- one on the decorated function itself which is needed for "grammar node reuse" that wraps the unbound function (note it will receive the self argument and use it as part of the cache key), and one that sits on the unbound GuidanceFunction's __get__, which is needed to ensure that myinstance.func returns the same GuidanceMethod each time (so that we can detect/handle recursive calls).

My thought is to essentially use a weakref for the latter cache and then move the other cache to the bound method so that it doesn't need to hold a reference to the instance at all. Make general sense?

hudson-ai avatar Sep 24 '24 16:09 hudson-ai

Side note, but I'm getting uncomfortable with the fact that we're not setting maxsize anywhere on our LRU caches. Just asking for memory leaks...

hudson-ai avatar Sep 24 '24 16:09 hudson-ai

Non-side note: @paulbkoch how do you feel about the spirit of this PR in the first place? Do you think that allowing methods to be decorated is a good thing? 😄

hudson-ai avatar Sep 24 '24 16:09 hudson-ai

@paulbkoch I finally feel good about the caching. Feel free to take a look on your own time or I'd be happy to jump on a call and do some code review 😄

hudson-ai avatar Sep 26 '24 15:09 hudson-ai

Had a think on this -- I'm a fan of the concept. Python decorators are allowed to be used everywhere, so it seems quite reasonable to me to make the guidance decorator work on class methods too. I think there is organizational/bookkeeping utility from being able to do this in our own codebase, even if end users don't take advantage of it as much.

Harsha-Nori avatar Sep 28 '24 18:09 Harsha-Nori

@Harsha-Nori glad you're a fan of the idea. Before pushing this through, I want to have a deeper conversation about what is or isn't cached, especially for the sake of recursive calls. Some subtle things going on there, even for vanilla guidance functions (non-methods). Definitely don't want to be adding unnecessary cognitive load for users.

hudson-ai avatar Sep 29 '24 04:09 hudson-ai

@Harsha-Nori @paulbkoch any objections to merging this?

Biggest "gotcha" I can really think of -- the self argument will be ignored in the determination of what does/does not constitute a "zero arg recursive call". I.e. modifying the state of self may have unexpected consequences to the naive user.

This PR does provide a proper way of handling this though: if the object implements a __hash__ method that reflects changing state on the instance, the above gotcha can be avoided.

hudson-ai avatar Oct 07 '24 17:10 hudson-ai

Merging on verbal review from @Harsha-Nori :)

hudson-ai avatar Oct 07 '24 20:10 hudson-ai