Design Meeting Notes (2023-10-30)

Open DanielRosenwasser opened this issue 2 years ago • 0 comments

Agent Patterns

  • Related to work from AutoGPT and AutoGen
  • Agents "specifying" other agents
    • "Matryoshka agents"
  • Agents passing messages back and forth.
    • e.g. actor-critic pattern - an actor agent responds, a critic agent questions the actor to refine answers or evaluate answers.
  • Can we extend TypeChat schemas so that they're full descriptions of agents?
  • Can't much of this be achieved through comments in our schemas?
    • Maybe
  • If we can build these out as examples, that would be useful.
  • What are the criteria by which people evaluate success of these agents?
    • Critic agents can evaluate in a very restricted way - "respond in 'good'/'bad'"
  • There's a view of this which is like reinforcement learning, where actor and critic just feed off each other. Where's the schema?
  • Stepping back - projects like AutoGPT and AutoGen are in Python - what are we even talking about without a Python version of TypeChat?
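
As a rough illustration of the actor-critic idea above, here is a minimal sketch in which the critic's reply is constrained to a tiny schema ("good"/"bad"). Everything here is hypothetical: `CriticVerdict`, `Actor`, `Critic`, `isCriticVerdict`, and `refine` are names invented for this sketch and are not TypeChat APIs, and the agents are plain functions standing in for language model calls.

```typescript
// Hypothetical sketch: none of these names are part of TypeChat.

// The critic may only answer in a very restricted way.
type CriticVerdict = { verdict: "good" | "bad"; question?: string };

// Stand-ins for model-backed agents. The critic returns raw, unvalidated output.
type Actor = (request: string, feedback: string[]) => string;
type Critic = (request: string, answer: string) => unknown;

// Runtime check that the critic's raw output matches CriticVerdict.
function isCriticVerdict(value: unknown): value is CriticVerdict {
    if (typeof value !== "object" || value === null) return false;
    const v = value as Record<string, unknown>;
    const verdictOk = v["verdict"] === "good" || v["verdict"] === "bad";
    const questionOk = v["question"] === undefined || typeof v["question"] === "string";
    return verdictOk && questionOk;
}

// Ask the actor, let the critic judge, and retry with the critic's question
// as feedback until the critic says "good" or the round budget runs out.
function refine(
    request: string,
    actor: Actor,
    critic: Critic,
    maxRounds = 3
): { answer: string; accepted: boolean } {
    const feedback: string[] = [];
    let answer = actor(request, feedback);
    for (let round = 0; round < maxRounds; round++) {
        const raw = critic(request, answer);
        if (!isCriticVerdict(raw)) {
            throw new Error("critic response did not match the CriticVerdict schema");
        }
        if (raw.verdict === "good") {
            return { answer, accepted: true };
        }
        feedback.push(raw.question ?? "Please improve the answer.");
        answer = actor(request, feedback);
    }
    return { answer, accepted: false };
}
```

The schema only covers the critic's side here; what a schema for the whole actor/critic exchange would look like is exactly the open question in the notes.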

TypeChat in Python (and .NET, and revisiting the Programs approach in TypeScript)

  • We have experiments in Python (e.g. Pypechat)

  • People use Pydantic a lot for validation. Sending Python as the spec language doesn't work as well as sending TypeScript, and JSON Schema dumps of Pydantic models don't work as well as TypeScript either.

  • Could we generate TypeScript from something like Pydantic data structures?

  • Libraries like Pydantic also have certain kinds of validation beyond what any static type system can encode. We can encode those in comments.

  • We could do the same thing with something like Zod as well.

  • We don't know how well libraries like Pydantic work on discriminated unions and collections of literals.

  • One of the nice things about these solutions is that dynamic schema generation (i.e. "my information is all in a database, generate a schema out of that") can be achieved, because they all have programmatic APIs.

  • Using a runtime type validation library sounds nice, but what about TypeChat programs?

    • Type-checking between steps is not all that simple.
  • Have to extend Pydantic in some way to describe APIs

  • Something where each ref is inlined and type-checked in that manner.

  • Will that work? What about the csv example? Table types are basically opaque, but exist across values.

  • Problem with this approach and opaque values (things that can't be JSONy) is... well, let's dive into the current programs approach.

  • Given the following API...

    interface API {
        getThing(...): Thing;
        processStuff({ thing: Thing, a: ..., b: ... }): ...;
    }
    

    for an intent, a language model will generate something like...

    {
        "@steps": [
            ...,
            {
                "@func": {
                    "name": "...",
                    "args": [
                        {
                            "thing": { "@ref": 0 },
                            "a": "...",
                            "b": "..."
                        }
                    ]
                }
            }
        ]
    }
    
    • Can imagine a runtime validator substituting the { "@ref": 0 } with the earlier value.
    • If you take a substitutive approach, all you end up with is pure JSON.
    • Can't have an API where you have factories for opaque types.
  • If we did this for Python and .NET, we would probably do the same for TypeScript as well.

  • Does this validation approach work? Don't you need an exemplar value for each return type?

  • Forget Python, how does this work with up-front validation?

    interface API {
      getThing(): { x: number, y: number };
      eatThing(value: { x: number, y: number }): void
    }
    

    could generate

    {
        "@steps": [
            {
                "@func": {
                    "name": "getThing",
                    "args": []
                }
            },
            {
                "@func": {
                    "name": "eatThing",
                    "args": [{ "@ref": 0 }]
                }
            }
        ]
    }
    

    which turns into...

    {
        "@steps": [
            {
                "@func": {
                    "name": "getThing",
                    "args": []
                }
            },
            {
                "@func": {
                    "name": "eatThing",
                    "args": [{ "@func": { "name": "getThing", "args": [] } }]
                }
            }
        ]
    }
    
    • Well, not quite: you would serialize the result of the first step and send it right into eatThing.
  • But that's not the same thing that's in TypeChat today - this doesn't do up-front validation, it validates at each step of evaluation.

  • We might be able to figure something out with runtime type validation libraries to do up-front validation.

  • Is up-front validation important?

    • We do think so. We believe that validation and summarization before executing is something we should strive to provide.
  • type StuffArg = {
      thing: Thing,
      a: number,
      b: number
    }
    
    interface API {
      eatThing(value: StuffArg): void
    }
    
    • You'll need some sort of Pydantic/Zod object to describe this...
    • What is the exemplar value for each?
    • If you use nominal equivalence, it's easier to provide some basic checking here.
    • That might be enough?
  • But TypeChat programs permit some amount of structural construction - object literals etc.

    • Kind of at odds with this concept of "nominal only".
    • Could say all refs have to be nominal.
  • Could come up with a very minimal type-checker across APIs.

  • How do you deal with the divergence between how this type-checks and how it all type-checks in the behind-the-scenes implementation of the API?

    • Impedance mismatch problem - not limited to "nominal versus structural". Does this type-checking strategy in TypeChat support subtyping?
  • We will need to prototype this out a bit.

    • We'll likely focus on Python here first just to prove it out and get a proof-of-concept.
    • Need to see what Pydantic, Zod, and the like provide here.
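
To make the "very minimal type-checker across APIs" idea a bit more concrete, here is one possible sketch of up-front validation where all refs are checked by nominal type name only, so no exemplar values are needed. This is an assumption-laden illustration and not how TypeChat works today: `ApiSignature`, `Ref`, `Arg`, `Step`, `Program`, and `checkProgram` are hypothetical names, the step shape follows the JSON used in these notes, and literals are limited to primitives (no structural object literals, no subtyping).

```typescript
// Hypothetical sketch: none of these names are part of TypeChat.

// Each API function is described purely by nominal type names.
type ApiSignature = { params: string[]; returns: string };

// An argument is either a reference to an earlier step or a primitive literal.
type Ref = { "@ref": number };
type Arg = Ref | number | string | boolean;

type Step = { "@func": { name: string; args: Arg[] } };
type Program = { "@steps": Step[] };

function isRef(arg: Arg): arg is Ref {
    return typeof arg === "object";
}

// Checks the whole program before any step executes and returns all errors found.
function checkProgram(api: Map<string, ApiSignature>, program: Program): string[] {
    const errors: string[] = [];
    // Nominal return type of each step, or undefined if the step was invalid.
    const stepTypes: (string | undefined)[] = [];
    program["@steps"].forEach((step, i) => {
        const { name, args } = step["@func"];
        const sig = api.get(name);
        if (sig === undefined) {
            errors.push(`step ${i}: unknown function '${name}'`);
            stepTypes.push(undefined);
            return;
        }
        if (args.length !== sig.params.length) {
            errors.push(`step ${i}: '${name}' expects ${sig.params.length} argument(s), got ${args.length}`);
        }
        args.forEach((arg, j) => {
            const expected = sig.params[j];
            if (expected === undefined) return; // extra argument, already reported
            let actual: string | undefined;
            if (isRef(arg)) {
                const target = arg["@ref"];
                if (!Number.isInteger(target) || target < 0 || target >= i) {
                    errors.push(`step ${i}: @ref ${target} must point at an earlier step`);
                    return;
                }
                actual = stepTypes[target];
                if (actual === undefined) return; // referenced step already failed
            } else {
                actual = typeof arg; // "number", "string", or "boolean"
            }
            if (actual !== expected) {
                errors.push(`step ${i}: argument ${j} of '${name}' expects ${expected}, got ${actual}`);
            }
        });
        stepTypes.push(sig.returns);
    });
    return errors;
}

// Example: the getThing/eatThing program from the notes, with a nominal Thing.
const api = new Map<string, ApiSignature>([
    ["getThing", { params: [], returns: "Thing" }],
    ["eatThing", { params: ["Thing"], returns: "void" }],
]);
const program: Program = {
    "@steps": [
        { "@func": { name: "getThing", args: [] } },
        { "@func": { name: "eatThing", args: [{ "@ref": 0 }] } },
    ],
};
console.log(checkProgram(api, program)); // prints []
```

Because types are compared by name, opaque values (such as a table handle) can flow between steps without ever being serialized. The cost is the impedance mismatch raised above: this check says nothing about structural compatibility or subtyping in the real implementation of the API.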

DanielRosenwasser · Nov 10 '23 01:11