Design Meeting Notes (2023-11-06)

Open DanielRosenwasser opened this issue 2 years ago • 0 comments

Python and TypeChat

  • Still thinking about Pydantic as a basis.
  • pydantic-core specifies the built-in discriminators for validators
  • Seems feasible to generate TypeScript from these. Either over JSON schema or directly over data structures.
    • What about custom validators/serializers? Can't handle those without something custom.
  • Would be weird to tell users they have to be on a fixed version of Pydantic.
  • Anecdotally, have had good results with JSON schema (or more specifically, YAML versions of JSON schema).
    • YAML seems to do really well...
    • As well as TypeScript as a spec language? (check back here)
  • What workflow could we have here?
    • Start with a YAML-authored JSON schema.
    • Have a proof-of-concept using kwalify.
  • The good part of these schemas is that they can specify more than built-in annotations for types.
  • So specify in
  • 3 concerns
    • Spec language for language models (succinct, few tokens, familiar to recent LLMs)
    • Validation expressivity (you can say "it's a zip code" or "it's an email address").
    • Developer UX (end-to-end, you have a pleasant authoring language, type-checking, auto-complete, etc.)
  • Tied to that are the following:
    • What does a developer write?
    • What does an LLM see?
    • What
  • Be aware: there's a distinction between errors committed by an LLM and errors committed by an end-user.
    • If a user says "my zip code is abcdefg", then that's a user error, not a language model error.
  • Another example - TypeChat Programs in Python
    • Top level functions exported.

      def add(x: float, y: float) -> float: ...
      def sub(x: float, y: float) -> float: ...
      # ...
      
  • Nothing seems to work as well as TypeScript for LLMs.
    • Lightest on tokens, most familiar.
  • Okay, but what's the authoring format? What do you do here? What if you need to generate types on the fly?
  • How will we solve the programmatic case in the TypeScript world?
    • We don't have a perfect solution right now. Maybe rely on libraries like Zod?
    • What's that going to have to look like? You say that something is a string, but then it's generated on the fly.
    • How does that get there?
  • Do these string unions/enums actually matter? Maybe for discriminated unions, but maybe not for items in a database?
  • What are these supposed to look like?
    • It may be best to insert these into comments.
  • So what would we do with Python?
  • We really, really want to see how accuracy compares between the TypeScript and Python forms.
    • If it's not accurate, we need to see if we can convert it into TypeScript.
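
To make the "convert it into TypeScript" idea concrete, here is a minimal sketch of rendering Python type definitions as TypeScript interfaces using only the standard library. It is not TypeChat's implementation: plain dataclasses stand in for Pydantic models, and the names `to_ts_type`, `to_ts_interface`, `Item`, and `Order` are hypothetical. Only a handful of annotations are handled (primitives, `list`, `Literal`, unions/`Optional`, nested dataclasses); custom validators are out of scope.

```python
import dataclasses
import json
import types
import typing
from typing import Literal, Optional, Union

# Assumed mapping from Python primitives to TypeScript type names.
_PRIMITIVES = {
    str: "string",
    int: "number",
    float: "number",
    bool: "boolean",
    type(None): "null",
}

# `X | Y` unions have their own origin type on Python 3.10+.
_UNION_ORIGINS = (Union, getattr(types, "UnionType", Union))


def to_ts_type(tp: object) -> str:
    """Render one Python annotation as a TypeScript type expression."""
    if tp in _PRIMITIVES:
        return _PRIMITIVES[tp]
    if isinstance(tp, type) and dataclasses.is_dataclass(tp):
        return tp.__name__  # refer to the nested interface by name
    origin = typing.get_origin(tp)
    args = typing.get_args(tp)
    if origin is list:
        inner = to_ts_type(args[0])
        # Parenthesize unions so `(A | B)[]` is not read as `A | B[]`.
        return f"({inner})[]" if " | " in inner else f"{inner}[]"
    if origin is Literal:
        return " | ".join(json.dumps(value) for value in args)
    if origin in _UNION_ORIGINS:
        return " | ".join(to_ts_type(arg) for arg in args)
    raise TypeError(f"unsupported annotation: {tp!r}")


def to_ts_interface(cls: type) -> str:
    """Render a dataclass as a TypeScript interface declaration."""
    hints = typing.get_type_hints(cls)
    lines = [f"interface {cls.__name__} {{"]
    for field in dataclasses.fields(cls):
        lines.append(f"    {field.name}: {to_ts_type(hints[field.name])};")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical schema, used only for illustration.
@dataclasses.dataclass
class Item:
    name: str
    quantity: int
    size: Literal["small", "large"]
    notes: Optional[str]


@dataclasses.dataclass
class Order:
    items: list[Item]


if __name__ == "__main__":
    print("\n\n".join(to_ts_interface(cls) for cls in (Item, Order)))
```

Under these assumptions, the developer authors Python types while the model sees TypeScript text, which is one way to separate "what does a developer write?" from "what does an LLM see?". The same generated text could also be used to compare accuracy against a prompt built from the Python form directly.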

DanielRosenwasser avatar Nov 10 '23 01:11 DanielRosenwasser