kedro-plugins
feat(datasets): Add limited `langchain` support for Anthropic, Cohere, and OpenAI models
Description
Adds limited support for langchain models.
This PR is a rough starting point for loading langchain API-based models.
The big issue here is langchain's model catalog. See the list here (just for chat models).
There's no way anyone could implement and maintain all of these.
Even if that was desirable, we can see from the CohereDataset example that there are going to be lots of details along the way that will make this task difficult.
Would love to see what the team thinks and if this is worth pushing forward!
Development notes
Adds four datasets for interacting with langchain models.
Checklist
- [x] Opened this PR as a 'Draft Pull Request' if it is work-in-progress
- [ ] Updated the documentation to reflect the code changes
- [ ] Added a description of this change in the relevant `RELEASE.md` file
- [ ] Added tests to cover my changes
Hey again @astrojuanlu! Excuse my slow reply, I was out for Thanksgiving.
I did use this in a PoC. However, I only ever used the YAML API.
I'll push some Python API examples.
Depending on #629, this may need to move to a contribution folder, but I think this is mostly ready.
Hi @ianwhale, thanks so much for your patience with this PR! We're about to launch our new experimental dataset contribution model, which means you can contribute datasets that are more experimental and don't need full test coverage, etc. They live here: https://github.com/kedro-org/kedro-plugins/tree/main/kedro-datasets/kedro_datasets_experimental.
I think this PR with datasets would be a perfect first candidate to go into kedro_datasets_experimental. I don't think there's much else you need to do, other than move it to that directory.
A couple of thoughts that relate to the topic and we can consider them in future:
- When we worked with langchain, we found it convenient to work with `chains`, which combine an `llm` and a `prompt` and provide a standardised interface for calling the model with run-time parameters (prompt placeholders). That way one can use different LLMs through the same interface. Example using the latest langchain API: https://python.langchain.com/v0.1/docs/integrations/chat/anthropic/
- We also arrived at dynamic model initialisation. In our case it lets users switch between different models (OpenAI, Cohere, Azure, etc.) with just one `LangchainDataset`, without needing to add extra datasets.
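To make the two ideas above concrete, here is a minimal sketch in plain Python. It is not the code from this PR and it does not use the kedro or langchain APIs: `load_class`, `DynamicModelDataset`, `make_chain`, and `echo_llm` are hypothetical names chosen for illustration, and the "model" is a stand-in function so the example runs without any API keys.

```python
from importlib import import_module
from typing import Any, Callable, Dict, Optional


def load_class(class_path: str) -> type:
    """Resolve a dotted path such as 'package.module.ClassName' to the class."""
    module_path, _, class_name = class_path.rpartition(".")
    if not module_path:
        raise ValueError(f"Expected a dotted path, got {class_path!r}")
    return getattr(import_module(module_path), class_name)


class DynamicModelDataset:
    """Hypothetical read-only dataset: builds whichever class the config names."""

    def __init__(self, class_path: str, kwargs: Optional[Dict[str, Any]] = None):
        self._class_path = class_path
        self._kwargs = kwargs or {}

    def load(self) -> Any:
        # The class is resolved at load time, so switching models only
        # requires changing `class_path` in the configuration.
        return load_class(self._class_path)(**self._kwargs)


def make_chain(prompt_template: str, llm: Callable[[str], str]) -> Callable[..., str]:
    """Combine a prompt template and a model into one callable."""

    def chain(**params: Any) -> str:
        # Fill the prompt placeholders with run-time parameters, then call the model.
        return llm(prompt_template.format(**params))

    return chain


def echo_llm(prompt: str) -> str:
    """Stand-in for a real model call; it only echoes the prompt it received."""
    return f"[echo] {prompt}"


chain = make_chain("Summarise {topic} in one sentence.", echo_llm)
print(chain(topic="kedro datasets"))
```

In a real project the dotted path would presumably point at a langchain model class and `kwargs` would carry its constructor arguments. Credentials handling, which the existing datasets take care of, is deliberately left out of this sketch.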
Thanks @ElenaKhaustova for reviewing! I really like your ideas to improve this. I'd suggest merging this version for now and when we have some time or someone from the community can help out we can implement the improvements.
Docs don't show up though 😬 https://docs.kedro.org/projects/kedro-datasets/en/latest/api/kedro_datasets_experimental.html
I was still planning on polishing before merging, but then it was already merged. Maybe let the assignee/author complete it next time instead of merging as reviewer?
> Maybe let the assignee/author complete it next time instead of merging as reviewer?
👍🏼