[BOUNTY - $300] Support function/tool calling as laid out by OpenAI's API spec
I think it would be awesome if exo supported tool calling as laid out by OpenAI's API spec. It would allow people to get reliable structured generation via the Instructor library that is quite popular.
Something to note: What Instructor sends to OpenAI in its tools request array is an object that is a JSON schema object. This is NOT represented in the OpenAI docs, even though that's what it is.
If you don't allow using arbitrary JSON schemas as your tools in the request, it will not work with Instructor, and you will not be able to have more interesting schemas that go more than 1 top level object with only primitive fields. So in order to ensure that this works well, I think a requirement should be supporting JSON schemas as what tools takes in.
Assigned $300 bounty to this
@AlexCheema can I take on this task? I've got some ideas
If not since I'm the reporter, no worries
I can try to take this on!
I can try to take this on!
Assigned!
Please tag me here or on Discord if you have any questions or run into any bugs,
sg! I will start working on this on friday. I can't join the discord through the invite link on the repo, is there another link?
Hi, I’d love to take this on! I’ve worked extensively with function/tool calling for clients using OpenAI’s API, and I’ve hosted hackathons and written tutorials on it. Excited to work together and contribute to the project!
@AlexCheema
Can the google sheet be updated to denote @master-senses has this one? Almost grabbed it when looking at the Sheet (cc @AlexCheema )
@master-senses not sure how you're planning on doing it, but I was gonna implement it using Outlines: https://github.com/dottxt-ai/outlines Their implementation of bound generation results in no performance penalty (and can actually result in faster generation if implemented right). Main difficulty is figuring out which of the lower level functions in outlines to use since Exo works at the tensor level in order to do the distributed computing.
Just figured I'd give you some guidance in case you're figuring out where to start still. Excited to see what you create!
Hey thanks for this! This helps. I'm still in school and it's been kicking my ass, so I've been a bit late with finishing this up
No progress made after a month so opening this back up.
I'm happy to claim this one and take it on, if allowed
I'm happy to claim this one and take it on, if allowed
Assigned - good luck!
Sweet! Thanks, will get started on this
FYSA: Started work on this last weekend. Made some good headway on this, getting more familiar with the inference part of Exo, and figured out where in the code I should be implementing this. I've also taken a look at vLLM and how they handle structured generation from an API perspective in order to mirror that as closely as possible with the goal of making this as transparently interchangeable as possible.
I plan on starting implementation this weekend if I have time, or doing so while off for the holidays. Should I be opening a draft PR as I make progress? Or should I just open the PR when I'm done.
I'm releasing this bounty to whomever would like to pick it up. Some big changes at work has made this a lower of a priority for me, and I'd rather see it get done than collect the bounty. Happy to help whomever picks it up! If no one picks it up, and I have enough free time, I'll just open a PR for it. But in the meantime please assume that will not happen.
If you're looking for where to start in terms of how to implement the behavior, please refer to vLLM's implementation of the OpenAI structured generation spec. By mirroring their behavior and logic, you ensure that what you're doing will help the highest number of GPU poor AI devs take their apps to real clusters in the future since's it's pretty much the canonical "high performance" inference backend that has structured generation implemented properly to OpenAI's spec.
Is this bounty now free to give a go @vanakema @AlexCheema? I'm happy to take a look after the current PRs I am working on are complete
fine with me