Design Meeting Notes (2023-10-23)

Open DanielRosenwasser opened this issue 2 years ago • 1 comments

Possible Topics

Issue tracker
Library integrations
OpenAI functions
Formal representations
TypeChat Programs
Other languages (e.g. Python and C#)
Other features

Issue Tracker Maintenance and Community Engagement

We had a pause - what happened?
- Vacations, explorations with internal teams (e.g. Copilot implementations), etc.
- Direct discussions with users, which took us away from GitHub for a bit.
Where are we now?
- Some TypeChat ideas prototyped in C# - https://github.com/microsoft/typechat.net
  - Partially to prove out integration with C# and Semantic Kernel.
  - Some long-term ideas prototyped - but want to come back to parity with the TypeScript implementation of TypeChat.
- TypeChat ideas also prototyped in Python - https://github.com/DanielRosenwasser/Pypechat
Still want more blog posts, want to have a video explainer - seeing is believing.
Plan to do a sweep over issues and PRs.

TypeChat and Orchestrators

Things like Semantic Kernel, langchain, etc.
Currently exploring how these can be integrated - only loose ideas at the moment.
Want to be able to find where these can complement each other, integrate better, etc.
- Planners based on TypeChat's JSON Programs

OpenAI Functions

https://github.com/microsoft/TypeChat/issues/45

OpenAI functions call one function at a time.
Described via JSON schema.
There's a function role that fits within a conversation.
The models are fine-tuned for this, but are not guaranteed to produce schema-conforming data (nor even well-formed data!).
Is there a lot of usage?
- There's a lot of excitement, but we haven't yet spoken with many users.
So why not just use the TypeChat approach here? Either TypeChat JSON validation or TypeChat JSON programs?
- We believe one subsumes the other - TypeChat being cross-model with type-checked validation is more robust.
- Technically, you could plug in your favorite schema validator to do this, right?
- Anecdotally, TypeChat performs very well. To be honest, a lot better than OpenAI functions in our experience.
  - We're missing evidence we can show to the outside world though.
Do we have any insight into long-term plans with OpenAI functions?
- Not yet, we would love to discuss further with these teams.
Conclusion?
- We don't yet think it makes sense to support directly - would love to better understand long-term plans from LLM providers like OpenAI.
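
To make the "validate, then repair" idea from this discussion concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `CoffeeOrder` shape, the hand-written `validateOrder` check (standing in for a real type checker or schema validator), and the synchronous `Complete` callback (a real model call would be asynchronous). It is not TypeChat's actual API.

```typescript
// Either validated data or a diagnostic message that can be fed back to the model.
type Result<T> = { success: true; data: T } | { success: false; message: string };

// Hypothetical target shape, not taken from the notes.
interface CoffeeOrder {
    size: "small" | "large";
    quantity: number;
}

function isSize(value: unknown): value is "small" | "large" {
    return value === "small" || value === "large";
}

// Hand-written validator standing in for a type checker or schema validator.
function validateOrder(text: string): Result<CoffeeOrder> {
    let value: unknown;
    try {
        value = JSON.parse(text);
    } catch {
        return { success: false, message: "Response is not well-formed JSON." };
    }
    if (typeof value !== "object" || value === null) {
        return { success: false, message: "Expected a JSON object." };
    }
    const { size, quantity } = value as { size?: unknown; quantity?: unknown };
    if (!isSize(size)) {
        return { success: false, message: 'Property "size" must be "small" or "large".' };
    }
    if (typeof quantity !== "number") {
        return { success: false, message: 'Property "quantity" must be a number.' };
    }
    return { success: true, data: { size, quantity } };
}

// Stand-in for a model call; synchronous only to keep the sketch short.
type Complete = (prompt: string) => string;

// Ask the model, validate the reply, and on failure re-prompt with the diagnostic.
function translate(request: string, complete: Complete, maxAttempts = 2): Result<CoffeeOrder> {
    let prompt = `Translate the request into JSON of type CoffeeOrder:\n${request}`;
    let last: Result<CoffeeOrder> = { success: false, message: "No attempts were made." };
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
        last = validateOrder(complete(prompt));
        if (last.success) {
            return last;
        }
        prompt += `\nYour previous reply was invalid: ${last.message}\nPlease return corrected JSON.`;
    }
    return last;
}
```

Because the loop only depends on a text-in, text-out callback, the same approach works with any model, which is the cross-model point made above.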

Formal Representations for LLMs

What does that mean?
- Verifiable and repairable syntactically/semantically
Areas of investigation
- Best representations of...
  - specifications (e.g. TypeScript types, "JSON templates", JSON schema...)
  - return formats (e.g. JSON, YAML, code in specific languages)
- Is there a compact schema form that we can adopt/invent with high accuracy? It'd be easier to verify if we had something more compact than JSON schema.
  - But new languages = new toolchains. Picking a well-defined subset of a known language like TypeScript might be more successful.
- How do we make these work across languages?
What about a separate authoring format?
- "SchemaLite"?
What about that subset of TypeScript?
What about TypeSpec?
TypeScript versus JSON Schema?
- TypeScript really shines on discriminated unions.
- What's the best way to describe a discriminated union to an LLM? For data interchange, that's fundamentally how you describe polymorphism.
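
For comparison, here is a small sketch of one discriminated union written both ways. The `Action` shapes and the `kind` discriminant are hypothetical examples, and the JSON Schema object is only a rough hand-written equivalent, not the output of any tool.

```typescript
// Hypothetical response shapes, chosen only to illustrate the pattern.
type Action =
    | { kind: "createEvent"; title: string; date: string }
    | { kind: "removeEvent"; title: string }
    | { kind: "unknown"; text: string };

// The "kind" property lets the type checker narrow each branch.
function describe(action: Action): string {
    switch (action.kind) {
        case "createEvent":
            return `Create "${action.title}" on ${action.date}`;
        case "removeEvent":
            return `Remove "${action.title}"`;
        case "unknown":
            return `Could not understand: ${action.text}`;
    }
}

// A roughly equivalent JSON Schema, written as a plain object for comparison.
const actionJsonSchema = {
    oneOf: [
        {
            type: "object",
            properties: {
                kind: { const: "createEvent" },
                title: { type: "string" },
                date: { type: "string" },
            },
            required: ["kind", "title", "date"],
        },
        {
            type: "object",
            properties: { kind: { const: "removeEvent" }, title: { type: "string" } },
            required: ["kind", "title"],
        },
        {
            type: "object",
            properties: { kind: { const: "unknown" }, text: { type: "string" } },
            required: ["kind", "text"],
        },
    ],
};
```

The TypeScript form takes five lines to state what the schema object needs about twenty-five for, which is part of why prompt size and readability come up in this comparison.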

Further Evolution of JSON Programs/Planning/Scripting/Orchestration

Some feedback on programs is that they're cool, but too limited.
- Clever ways to enable some stuff like branching and iteration, but they don't always scale.
Models are being asked to produce an IR that is turned into another language, then interpreted.
The feedback loop from a type-checker is pretty removed.
- Hard problem with verification.
But we have concerns about sandboxing and guaranteed availability (i.e. keeping your host programs working in spite of the halting problem).
Plus, what if you have millions of functions, or methods on objects with thousands of types, etc.?
- And if we want to deliver plans with no hallucinations, we want to be able to summarize plans for humans too. So we want that...
- But how do you actually present this to a user?
- Just be able to provide transactions/undo? Commit/unroll?
Maybe there's some inspiration to be taken from languages like PowerShell, Tcl? Bring your own language features, build it up.
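
As a reference point for this discussion, here is a minimal sketch of how a JSON program can be interpreted. The `@steps`, `@func`, `@args`, and `@ref` shapes are modeled on TypeChat's JSON programs, but treat the exact names as an assumption; `Api` and `runProgram` are hypothetical helpers written for this example.

```typescript
// Shapes modeled on TypeChat's JSON programs; exact names are an assumption.
interface ResultReference { "@ref": number }
interface FunctionCall { "@func": string; "@args"?: Expression[] }
type Expression = number | string | boolean | null | ResultReference | FunctionCall;
interface Program { "@steps": FunctionCall[] }

// Hypothetical host API: the only functions a program is allowed to call.
type Api = Record<string, (...args: unknown[]) => unknown>;

// Runs each step in order; "@ref" refers to the result of an earlier step.
function runProgram(program: Program, api: Api): unknown {
    const results: unknown[] = [];

    function evaluate(expr: Expression): unknown {
        if (typeof expr === "object" && expr !== null) {
            if ("@ref" in expr) {
                const index = expr["@ref"];
                if (index < 0 || index >= results.length) {
                    throw new Error(`Invalid reference: ${index}`);
                }
                return results[index];
            }
            return call(expr);
        }
        return expr; // literals evaluate to themselves
    }

    function call(fc: FunctionCall): unknown {
        const name = fc["@func"];
        const fn = Object.prototype.hasOwnProperty.call(api, name) ? api[name] : undefined;
        if (!fn) {
            throw new Error(`Unknown function: ${name}`);
        }
        return fn(...(fc["@args"] ?? []).map(evaluate));
    }

    for (const step of program["@steps"]) {
        results.push(call(step));
    }
    return results[results.length - 1];
}
```

The sketch has no branching or iteration and the whole plan is fixed before the first step runs, which illustrates the "too limited" feedback. On the other hand, the program can only reach functions listed in `api`, and it always terminates as long as those functions do.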

Multi-Agent/Multi-Schema/Routing Support

Dynamic Schema Generation from Data
Programmatic Schema Construction
- Dynamically populating structure and entities
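
One way to read "dynamically populating structure and entities" is to generate the schema text itself from runtime data. The sketch below is only an illustration under that assumption: `buildSchema`, the `PlaySong` type name, and the sample entities are all hypothetical.

```typescript
// Builds TypeScript schema text whose string-literal union is populated from runtime data.
function buildSchema(typeName: string, property: string, entities: string[]): string {
    // JSON.stringify quotes and escapes each entity so it is a valid string literal.
    const union = entities.map((entity) => JSON.stringify(entity)).join(" | ") || "never";
    return `export interface ${typeName} {\n    ${property}: ${union};\n}`;
}

// Example: restrict the "artist" property to entities known at runtime.
const schemaText = buildSchema("PlaySong", "artist", ["Bach", "Miles Davis"]);
```

Here `schemaText` is an interface declaration whose `artist` property only accepts `"Bach"` or `"Miles Davis"`, so a response naming any other artist would fail validation against it.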

Long-Term Features We'd Like to Tackle

Embeddings
Vocabulary
Multi-Schema
Routing
Multi-Model Infrastructure

Oct 31 '23 21:10 DanielRosenwasser

OpenAI functions now support parallel calls (https://platform.openai.com/docs/guides/function-calling/parallel-function-calling) and can output JSON more reliably.

I have also been implementing dynamic schema construction and schema generation from data, as well as a JSON program to generate schemas.

One current limitation is that the generated JSON Programs are linear: the LLM is not given a chance to reflect on intermediate results and readjust the plan.

Nov 07 '23 05:11 xumx
