Prompt templates should be pluggable
There's a growing list of prompt templates that LangStream should be able to leverage. For example, FLARE: https://github.com/devinbost/langchain/blob/f35db9f43e2301d63c643e01251739e7dcfb6b3b/libs/langchain/langchain/chains/flare/prompts.py
(Demo with CassIO is here: https://colab.research.google.com/drive/1FGMmmkvy3PH7gWQdBBr05HFLTsM6PRU6)
and many other prompts are available in this directory: https://github.com/devinbost/langchain/tree/b3a8fc7cb17170b8a272a0a1366b0f18d8f7ab4a/libs/langchain/langchain/chains (Look at the files named "prompt.py" in the subdirectories.)
If we can plug these in, we won't need to spend so much time trying to keep up as new research produces additional prompt techniques.
I think that it is too early to standardize a mechanism to handle prompts, loops and so on.
We need more feedback from the community.
LangChain is trying to build a model around this topic, but we can wait to see it take more shape.
It would be great to see how people try to build these patterns with LangStream.
Moving from a flat python file to a scalable deployment with async batch processing will be fun and interesting.
After working a little bit on sample applications and demos, I realise that we need to at least make "prompts" a first-class object, like we have agents, assets and resources.
Also, one big problem is that the prompt format highly depends on two factors:
- the LLM model you are using (some LLMs are very "chatty" and interact with a human; other LLMs are more formal, like the "instruct" models)
- the LLM API (some APIs let you provide context and examples, like Vertex; others require a list of messages with "roles", like OpenAI; others only take a simple string, like open-source models served by Ollama)
My proposal is to add a new file type that contains prompts; in the chat-completion and text-completion agents you can then refer to a prompt by id.
Something like:
`prompt1.yaml`
```yaml
prompt:
  - id: qa-chatbot-openai-sample-1
    templating-language: mustache
    messages:
      - role: system
        text: |
          .... .{{documents}}..
      - role: user
        text: "{{value.question}}"
```
`prompt2.yaml`
```yaml
prompt:
  - id: qa-chatbot-ollama-sample-1
    templating-language: mustache
    examples:
      .....
    text: |
      .... .{{documents}}..
```
`prompt3.yaml`
```yaml
prompt:
  - id: qa-chatbot-vertex-sample-1
    templating-language: mustache
    examples:
      .....
    messages:
      - role: system
        text: |
          .... .{{documents}}..
      - role: user
        text: "{{value.question}}"
```
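To complete the picture, the agent side of the proposal could look something like the sketch below; the `prompt-id` field is hypothetical (today the agents take the prompt inline in their configuration):
```yaml
pipeline:
  - name: "answer the question"
    type: "ai-chat-completions"
    configuration:
      model: "gpt-3.5-turbo"
      # hypothetical field: resolves to the prompt declared in prompt1.yaml
      prompt-id: qa-chatbot-openai-sample-1
```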
We can also go further and allow prompts to be read from an external system, like a database.
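As a rough sketch of that direction (every field name here is hypothetical, nothing like this exists today), a prompt entry could point at a datasource instead of inlining the text:
```yaml
prompt:
  - id: qa-chatbot-dynamic-1
    templating-language: mustache
    # hypothetical: load the template from an external store instead of inlining it
    source:
      datasource: "prompt-store"                        # a resource defined in the application
      query: "SELECT template FROM prompts WHERE id = ?"
      refresh-interval: 60s                             # pick up edits without redeploying
```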
@eolivelli since the only place you use the prompt is the completion, and you likely have only one completions agent in the application, what is the value of moving it out of the agent configuration? The only pro I see is that it could be easier to load it dynamically from an external system, which I don't think is a common usage.
Can you expand on what the benefit of this change would be?
Apart from making it (in the future) loadable from an external system, I see that putting the prompts into separate files will help in these cases:
**Developers should focus on the prompt and not on the boilerplate of the pipeline**
Prompts are a core concept in LLM-based applications. If you strip them out of the pipeline file you can "focus" on the prompt, given that the pipeline is probably almost the same for a given type of application (Q/A chatbot...).
**You almost always need more than one prompt**
Some applications (really, any application that you want to deploy seriously in production 😉) require multiple prompts (see the samples above). Having to write them inside the pipeline files is pretty awkward, and your pipeline files tend to become huge and hard to read and maintain. A sketch of a multi-prompt pipeline follows below.
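For example, here is a sketch (again using the hypothetical `prompt-id` field from above) of a pipeline with two completion steps, each pulling its prompt from a separate file, while the pipeline itself stays short and readable:
```yaml
pipeline:
  - name: "rewrite the user question"
    type: "ai-text-completions"
    configuration:
      model: "gpt-3.5-turbo-instruct"
      prompt-id: question-rewriter-sample-1   # hypothetical second prompt file
  - name: "answer with the retrieved context"
    type: "ai-chat-completions"
    configuration:
      model: "gpt-3.5-turbo"
      prompt-id: qa-chatbot-openai-sample-1   # the prompt from prompt1.yaml
```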