[good first issue - advanced][Example] Create a dataflow modeling information extraction using an LLM
Write an example dataflow that uses Hamilton to model an information extraction task using an LLM.
For example:
- given an output schema
- given input text
- make a prompt that is sent to an LLM API (pick one)
- then write a function to validate the output
For inspiration you can look at Langchain's implementation.
The code for this example should end up under the /examples/LLM_Workflows directory.
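The steps listed above could be sketched as plain Python functions written in the Hamilton style, where each function is a node and each parameter name refers either to another function or to a runtime input. This is only a minimal sketch under assumptions: the function names, the prompt wording, and the schema format (field name mapped to a type name) are illustrative, and `llm_client` is a hypothetical callable standing in for whichever LLM API is picked.

```python
import json
from typing import Callable, Dict, List


def extraction_prompt(output_schema: Dict[str, str], input_text: str) -> str:
    """Builds the prompt from the desired output schema and the input text."""
    schema_json = json.dumps(output_schema, indent=2)
    return (
        "Extract every entity described in the text below.\n"
        "Respond with only a JSON list of objects using these fields and types:\n"
        f"{schema_json}\n\n"
        f"Text: {input_text}"
    )


def llm_response(extraction_prompt: str, llm_client: Callable[[str], str]) -> str:
    """Sends the prompt to an LLM.

    `llm_client` is a hypothetical callable (prompt in, completion text out);
    swap in a real API call for the provider you choose.
    """
    return llm_client(extraction_prompt)


def extracted_records(llm_response: str) -> List[dict]:
    """Parses the raw LLM response, failing loudly if it is not a JSON list."""
    try:
        parsed = json.loads(llm_response)
    except json.JSONDecodeError as e:
        raise ValueError(f"LLM response was not valid JSON: {e}") from e
    if not isinstance(parsed, list):
        raise ValueError("Expected a JSON list of records.")
    return parsed
```

Because the functions are ordinary Python, they can be exercised by hand with a stubbed `llm_client` before any API key is involved; wiring them into a Hamilton driver is left out here since it depends on how the example module is laid out.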
Hi @skrawcz , I want to work on this issue. Can you guide me a little on what exactly needs to be done here?
Sure, if the following makes sense to you. If not we can see if there's another issue that would be a fit.
- Are you familiar with Hamilton? If not, I suggest going through the tutorial at www.tryhamilton.dev to understand what Hamilton does. That will help you understand the code to write.
- Are you familiar with LLM APIs? Do you know what a prompt is?
- A use case for this work might look like the following:
I have user feedback data about some product, and I want to extract some information from it. E.g. what the product the review is about, what the sentiment is, etc.
Does that make sense?
In terms of an analogous example - you can see LangChain's example which shows a schema, i.e. what we want to get back out, and an example sentence, and then the result.
i.e. given
Alex is 5 feet tall. Claudia is 1 feet taller Alex and jumps higher than him. Claudia is a brunette and Alex is blonde.
it outputs:
[{'name': 'Alex', 'height': 5, 'hair_color': 'blonde'},
{'name': 'Claudia', 'height': 6, 'hair_color': 'brunette'}]
- LLMs aren't always good at following directions, so the example should show a check to parse & validate the output returned.
- So the task would be to encode the above into a Hamilton dataflow, or DAG, with the first deliverable being code that ends up in the examples/ directory. Longer term, we'd ship this as part of the user-contributed code library that will ship with Hamilton.
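The parse-and-validate check mentioned above might look something like the following. It is a sketch only: the schema format (field name mapped to a Python type) and the function name `validate_records` are assumptions made for illustration, not something taken from LangChain or Hamilton.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical schema format: field name -> expected Python type.
PERSON_SCHEMA: Dict[str, type] = {"name": str, "height": int, "hair_color": str}


def validate_records(
    records: List[Any], schema: Dict[str, type]
) -> Tuple[List[Dict[str, Any]], List[str]]:
    """Splits parsed LLM output into valid records and error messages."""
    valid: List[Dict[str, Any]] = []
    errors: List[str] = []
    for i, record in enumerate(records):
        if not isinstance(record, dict):
            errors.append(f"record {i}: expected an object, got {type(record).__name__}")
            continue
        problems = []
        for field, expected_type in schema.items():
            if field not in record:
                problems.append(f"missing '{field}'")
            elif not isinstance(record[field], expected_type):
                problems.append(f"'{field}' should be {expected_type.__name__}")
        if problems:
            errors.append(f"record {i}: " + ", ".join(problems))
        else:
            valid.append(record)
    return valid, errors


if __name__ == "__main__":
    llm_output = [
        {"name": "Alex", "height": 5, "hair_color": "blonde"},
        {"name": "Claudia", "height": 6, "hair_color": "brunette"},
        {"name": "Sam", "height": "tall"},  # deliberately malformed record
    ]
    good, bad = validate_records(llm_output, PERSON_SCHEMA)
    print(good)
    print(bad)
```

Returning the errors alongside the valid records, rather than raising on the first problem, leaves the caller free to decide whether to retry the LLM call, drop the bad records, or fail.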
Hi @skrawcz , thanks for explaining the use case. Let me try it at a basic level and get back to you. I am taking inspiration from the Knowledge Retrieval example under LLM workflows, and I guess the core concept remains the same.
Hi @skrawcz , thanks for suggesting LangChain, we can leverage the same API for this use case. On top of that, we can have a wrapper for encoding it into a Hamilton dataflow. But to use LangChain we need an open_api_key, which I am unable to find. I am curious where you are setting it to use the OpenAI API in the other examples. Can you please help me with that?
> thanks for suggesting LangChain, we can leverage the same API for this use case. On top of that, we can have a wrapper for encoding it into a Hamilton dataflow.
Just to be clear. We don't want to wrap langchain. We want to take a "chain" and reimplement it in Hamilton.
> we need an open_api_key
Yep. You need to sign up for one, e.g. via openai.com. It comes with a few dollars of free credit to use the API. It also doesn't have to be OpenAI if there's an alternative LLM API.
@skrawcz, can I work on it? Or is it a WIP?
Yes, it is open and you could take it. But this could be a bit of work, since you need to figure out how to write the code. The output should look similar in style to this text_summarization example. It will require you to understand the LangChain code and then translate it into how it might look in Hamilton. For a quicker task, I would look at https://github.com/DAGWorks-Inc/hamilton/issues/284 or #410.
@skrawcz, I am familiar with both OpenAI's API and LangChain, and have tried them myself. So I would love to give it a try :)