Markdown serialization/deserialization of `llm` conversations

Open davidgasquez opened this issue 7 months ago • 3 comments

Right now, llm allows conversations to be exported with `llm logs --cid X`. What would it take for llm to be able to read the exported conversations back?

This issue is to figure out a way we could serialize and deserialize conversations and model configurations.

Having a text-based format for persisting and sharing transcripts is super useful. It might also be a great way to replay conversations or edit/tweak responses.

We talked about it on Bluesky, and the format that seems to best fit our needs is something like the following code block.

---
model: gpt-4
temperature: 0.7
system: |
  You are a helpful bot interested in the weather.
---

<!-- role: user option:another -->
Could you show me the weather forecast for Paris?
<!-- /role: user -->

<!-- role: assistant --> 
Sure! Here’s the forecast for Paris…
<!-- /role: assistant -->

This looks good when rendered as a Gist and also allows for arbitrary configurations at the response level.
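
To make the round trip concrete, here is a rough sketch of what a parser for this format could look like (nothing is implemented yet; it assumes PyYAML for the front matter, and the function name is just illustrative):

```python
# Hypothetical parser for the proposed transcript format (sketch only).
import re
import yaml  # PyYAML


def parse_transcript(text):
    # Split the YAML front matter (between the first pair of "---" lines)
    # from the markdown body.
    match = re.match(r"^---\n(.*?)\n---\n(.*)$", text, re.DOTALL)
    config = yaml.safe_load(match.group(1))
    body = match.group(2)

    # Each message sits between an opening and closing role comment; any
    # extra "key:value" tokens in the opening comment become per-message
    # options (e.g. "option:another" above).
    pattern = re.compile(
        r"<!--\s*role:\s*(\w+)([^>]*?)-->\s*(.*?)\s*<!--\s*/role:\s*\1\s*-->",
        re.DOTALL,
    )
    messages = []
    for role, extras, content in pattern.findall(body):
        options = dict(
            token.split(":", 1) for token in extras.split() if ":" in token
        )
        messages.append({"role": role, "options": options, "content": content})
    return config, messages
```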

A similar thing could be achieved with headers.

---
role: user

prompt

---
role: assistant

reply

Finally, an llm conversation file could be played back with `llm filechat filename.md` or similar, and the new response would be appended at the end of the file following the chosen format.
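
As a sketch of how that playback step could work on top of the parser above (`filechat` doesn't exist yet; `llm.get_model()`, `model.conversation()` and `conversation.prompt()` are the existing Python API, everything else here is made up):

```python
# Hypothetical playback step for an "llm filechat"-style command (sketch only).
import llm


def filechat(path):
    # Parse the transcript, replay the user turns against the configured
    # model, then append the newest reply to the file in the same format.
    config, messages = parse_transcript(open(path).read())
    model = llm.get_model(config["model"])
    conversation = model.conversation()

    response = None
    for index, message in enumerate(messages):
        if message["role"] != "user":
            continue  # assistant turns are regenerated by the replay
        response = conversation.prompt(
            message["content"],
            system=config.get("system") if index == 0 else None,
        )

    if response is not None:
        with open(path, "a") as f:
            f.write(
                "\n<!-- role: assistant -->\n"
                f"{response.text()}\n"
                "<!-- /role: assistant -->\n"
            )
```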

davidgasquez avatar Apr 26 '25 13:04 davidgasquez

I like this idea a lot. The markdown export is already furiously useful and I share them all the time (usually as gists). Having it work as a round-trippable format has some very interesting knock-on effects:

  • I can share a conversation with you and you can continue it yourself
  • Conversations that started in one model could be replayed/continued in another

Plus it encourages a general goal I have for LLMs which is to get people to share more of their prompts!

simonw avatar Apr 26 '25 13:04 simonw

I want to ditch the -u/--usage option and always include that information.

Options aren't visible in the markdown yet, we should fix that.

My priority is that the default export format looks good when rendered, so I'm not keen on anything with too much visual junk.

We also need to make it unambiguous. Right now the sections within the markdown output are delimited by headings like ## Prompt - but this means if the LLM output includes conflicting headings it could break a round-trip parser.

I quite like the idea of invisible HTML comments to fix this:

## Prompt

<!-- prompt-starts: 7363949d-448e-4b98-b580-1f6133396f2f -->
User prompt shows here.
<!-- prompt-ends: 7363949d-448e-4b98-b580-1f6133396f2f -->

I dropped a UUID in there that would be generated for each document, which means that even those comments themselves would be safe to include in the data section of the document (the parser would stick with the first UUID from a trusted section of the document).
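
A rough sketch of that parsing rule (illustrative only, not implemented): grab the first delimiter UUID that appears in the document, then only honour delimiters carrying that exact UUID, so look-alike comments inside model output can't confuse the parser.

```python
# Sketch: trust the first delimiter UUID, ignore any look-alikes in the data.
import re


def extract_sections(markdown):
    first = re.search(r"<!-- (\w+)-starts: ([0-9a-f-]{36}) -->", markdown)
    if first is None:
        return []
    doc_uuid = first.group(2)

    pattern = re.compile(
        r"<!-- (\w+)-starts: " + re.escape(doc_uuid) + r" -->\n"
        r"(.*?)\n"
        r"<!-- \1-ends: " + re.escape(doc_uuid) + r" -->",
        re.DOTALL,
    )
    # e.g. [("prompt", "User prompt shows here."), ...]
    return pattern.findall(markdown)
```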

simonw avatar Apr 26 '25 13:04 simonw

Fun twist: I'm planning on having LLM support conversation trees soon:

  • #938

Representing those in Markdown would be fun! Might be an argument to use > quotation blocks as part of the format, since those render nicely as a nested collection.

We could punt on that entirely though and say that the Markdown output is only ever a linear sub-tree.

simonw avatar Apr 26 '25 13:04 simonw