TypeChat
Design Meeting Notes (2023-11-06)
Python and TypeChat
- Still thinking about Pydantic as a basis.
- pydantic-core specifies the built-in discriminators for validators
- Seems feasible to generate TypeScript from these. Either over JSON schema or directly over data structures.
- What about custom validators/serializers? Can't handle those without something custom.
- Would be weird to tell users they have to be on a fixed version of Pydantic.
- Anecdotally, have had good results with JSON schema (or more specifically, YAML versions of JSON schema).
- YAML seems to do really well...
- As well as TypeScript as a spec language? (check back here)
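To make the "generate TypeScript from JSON schema" idea concrete, here is a minimal sketch using only the standard library. It is not TypeChat's or Pydantic's implementation. The schema `SENTIMENT_SCHEMA` and the helpers `ts_type` and `ts_interface` are hypothetical names, and only a small subset of JSON Schema (enums, primitives, arrays) is handled. Anything else, including custom validators, falls back to `unknown`.

```python
import json

# Hypothetical schema for illustration; in practice it might be emitted by
# Pydantic or loaded from a YAML-authored file.
SENTIMENT_SCHEMA = {
    "title": "SentimentResponse",
    "type": "object",
    "properties": {
        "sentiment": {"enum": ["negative", "neutral", "positive"]},
        "confidence": {"type": "number"},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["sentiment"],
}

_PRIMITIVES = {
    "string": "string",
    "number": "number",
    "integer": "number",
    "boolean": "boolean",
    "null": "null",
}


def ts_type(schema: dict) -> str:
    """Map a small subset of JSON Schema onto a TypeScript type expression."""
    if "enum" in schema:
        # json.dumps yields double-quoted literals that are also valid TypeScript.
        return " | ".join(json.dumps(value) for value in schema["enum"])
    kind = schema.get("type")
    if kind == "array":
        item = ts_type(schema.get("items", {}))
        return f"({item})[]" if " | " in item else f"{item}[]"
    if kind in _PRIMITIVES:
        return _PRIMITIVES[kind]
    return "unknown"  # anything outside the subset is left opaque


def ts_interface(schema: dict) -> str:
    """Render an object schema as a TypeScript interface declaration."""
    required = set(schema.get("required", []))
    lines = [f"interface {schema['title']} {{"]
    for name, prop in schema.get("properties", {}).items():
        optional = "" if name in required else "?"
        lines.append(f"    {name}{optional}: {ts_type(prop)};")
    lines.append("}")
    return "\n".join(lines)


print(ts_interface(SENTIMENT_SCHEMA))
```

Going through plain dictionaries rather than Pydantic internals is one way to avoid tying users to a fixed Pydantic version, at the cost of losing anything the schema cannot express.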
- What workflow could we have here?
- Start with a YAML-authored JSON schema.
- Have a proof-of-concept of using kwalify
- The good part of these schemas is that they can specify more than built-in annotations for types.
- So specify in
- 3 concerns
- Spec language for language models (succinct, few tokens, familiar to recent LLMs)
- Validation expressivity (you can say "it's a zip code" or "it's an email address").
- Developer UX (end-to-end, you have a pleasant authoring language, type-checking, auto-complete, etc.)
- Tied to that are the following:
- What does a developer write?
- What does an LLM see?
- What
- Be aware: there's a distinction between errors committed by an LLM and errors committed by an end-user.
- If a user says "my zip code is abcdefg", then that's a user error, not a language model error.
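The distinction between model errors and user errors could be sketched roughly as follows. This is an illustration, not an agreed design: `Diagnosis`, `diagnose_address`, the `zipCode` property, and the US-style zip pattern are all assumptions made for the example. It also assumes the model transcribed the user's input faithfully, so a well-shaped but invalid value is attributed to the user.

```python
import re
from dataclasses import dataclass

# US-style zip code, chosen only for illustration.
ZIP_CODE = re.compile(r"\d{5}(-\d{4})?")


@dataclass
class Diagnosis:
    source: str   # "ok", "model", or "user"
    message: str


def diagnose_address(response: object) -> Diagnosis:
    """Classify a parsed model response by who is likely responsible for a failure."""
    # Shape problems: the model did not follow the schema, so the natural
    # next step is a repair prompt sent back to the model.
    if not isinstance(response, dict) or not isinstance(response.get("zipCode"), str):
        return Diagnosis("model", "expected an object with a string 'zipCode' property")
    # Value problems: the shape is right but the value fails validation,
    # so the message should go back to the end-user instead.
    if not ZIP_CODE.fullmatch(response["zipCode"]):
        return Diagnosis("user", f"{response['zipCode']!r} is not a valid zip code")
    return Diagnosis("ok", "")


print(diagnose_address({"zipCode": "abcdefg"}))
```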
- Another example - TypeChat Programs in Python
- Top-level functions exported:
    def add(x: float, y: float) -> float: ...
    def sub(x: float, y: float) -> float: ...
    # ...
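One way such an exported API might be shown to a language model is to render signature-only stubs from real Python functions with the standard `inspect` module. This is a sketch under assumptions: `describe_api` is a hypothetical helper, and the notes do not settle on this approach.

```python
import inspect


def add(x: float, y: float) -> float:
    """Add two numbers."""
    return x + y


def sub(x: float, y: float) -> float:
    """Subtract y from x."""
    return x - y


def describe_api(functions) -> str:
    """Render each function as a signature-only stub for inclusion in a prompt."""
    stubs = []
    for fn in functions:
        doc = inspect.getdoc(fn)
        # Keep only the first docstring line so each stub stays on one line.
        comment = f"  # {doc.splitlines()[0]}" if doc else ""
        stubs.append(f"def {fn.__name__}{inspect.signature(fn)}: ...{comment}")
    return "\n".join(stubs)


print(describe_api([add, sub]))
```

The developer writes ordinary typed functions, and the model only sees the stubs, which keeps the prompt light on tokens.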
- Nothing seems to work as well as TypeScript for LLMs.
- Lightest on tokens, most familiar.
- Okay, but what's the authoring format? What do you do here? What if you need to generate types on the fly?
- How will we solve the programmatic case in the TypeScript world?
- We don't have a perfect solution right now. Maybe rely on libraries like Zod?
- What's that going to have to look like? You say that something is a string, but then it's generated on the fly.
- How does that get there?
- Do these string unions/enums actually matter? Maybe for discriminated unions, but maybe not for items in a database?
- What are these supposed to look like?
- It may be best to insert these into comments.
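The "insert these into comments" idea might look like the sketch below: the field keeps the type `string`, and values known only at request time are listed in a comment in the schema text sent to the model. `schema_with_known_values` and the example names are hypothetical, and whether a comment steers the model as well as a string union does is exactly the open question above.

```python
import json


def schema_with_known_values(type_name: str, field: str, values: list[str]) -> str:
    """Build TypeScript schema text whose field stays `string`, with the
    dynamically generated values listed in a comment rather than a union."""
    # json.dumps quotes and escapes each value so it stays on one comment line.
    listed = ", ".join(json.dumps(value) for value in values)
    return "\n".join([
        f"interface {type_name} {{",
        f"    // Known values at request time: {listed}",
        f"    {field}: string;",
        "}",
    ])


# Values that might come from a database lookup at request time.
print(schema_with_known_values("PlayRequest", "artist", ["Bach", "Holst"]))
```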
- So what would we do with Python?
- We really want to measure how accuracy compares between the TypeScript and Python forms.
- If the Python form is not accurate, we need to see if we can convert it into TypeScript.