Function or Tool calling

Open jayakumark opened this issue 1 year ago • 8 comments

🚀 The feature, motivation and pitch

It looks like Llama3 can call tools such as Google or Bing search. It would be good to have an example script with a prompt template for Llama3 function calling.

Alternatives

No response

Additional context

No response

jayakumark avatar Apr 18 '24 20:04 jayakumark

@jayakumark thanks for opening this issue. We don't have support for function calling yet.

HamidShojanazeri avatar Apr 19 '24 04:04 HamidShojanazeri

Thanks. I was going through Zuck's podcast with Dwarkesh Patel, where he mentioned this, and was curious how it was done.

Mark Zuckerberg 00:20:42

For Llama-2, the tool use was very specific, whereas Llama-3 has much better tool use. We don't have to hand code all the stuff to have it use Google and go do a search. It can just do that. Similarly for coding and running code and a bunch of stuff like that.

jayakumark avatar Apr 19 '24 13:04 jayakumark

@HamidShojanazeri I just want to know whether function/tool calling is on the native pipeline. We've already built our own fine-tune with that, but it would be easier to build off an official, robust tool calling implementation from Meta.

avianion avatar Apr 28 '24 15:04 avianion

The feedback is on the team's radar, but we have seen some promising results in OSS for Llama3 70B out of the box. https://twitter.com/HamelHusain/status/1784769559364608222

HamidShojanazeri avatar May 07 '24 00:05 HamidShojanazeri

Yep you're right! It does! Would be nice to have a more formal, polished fine tune though.

I think that and long context are on everyone's radar in the OSS community.

avianion avatar May 07 '24 00:05 avianion

Sure. For long context, we are working on a research recipe here; feel free to take a look, though it's still WIP.

Can you please clarify a bit what you would like to see as an example or case study for function calling? @avianion

HamidShojanazeri avatar May 07 '24 00:05 HamidShojanazeri

Yes, in terms of long context, there are some RoPE theta possibilities as well as the above algorithm, but of course an official long context model from Meta would be nice. Typically, if a model has a context of 8k, it will perform well and have great recall up to 4k. If Meta released an official 64k or 128k model, I would trust it to perform very well up to about half of that context length.


Function calling. My company's specific use case is data analysis, so we first have to fine-tune the model to be able to do function calling recursively, and then on top of that fine-tune that model for a specific use case or API.

What would be useful is if a model came with native tool calling (like the Mistral 8x22B, for example). By that I mean being able to support basically the entire JSON Schema for function calling.

https://json-schema.org/

So enums, anyOf, oneOf, nested fields, etc., and adherence to form. One of the difficult things about LLMs is getting deterministic or structured outputs from them. Function calling makes this easier.
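To make the schema point concrete, here is a minimal sketch of a tool definition that uses an enum, an anyOf, and a nested object, plus a tiny checker for the arguments a model returns. The tool name `get_sales`, its fields, and the `validate` helper are all hypothetical, and the checker covers only a small subset of JSON Schema (type, enum, anyOf, properties, required); a real project would more likely use a full validator library.

```python
import json

# Hypothetical tool definition; the name and fields are illustrative only.
GET_SALES_TOOL = {
    "name": "get_sales",
    "description": "Fetch aggregated sales figures.",
    "parameters": {
        "type": "object",
        "properties": {
            "region": {"type": "string", "enum": ["emea", "apac", "amer"]},
            "period": {
                "anyOf": [
                    {"type": "integer"},
                    {"type": "string", "enum": ["ytd", "all"]},
                ]
            },
            # Nested field: an object inside the arguments object.
            "filters": {
                "type": "object",
                "properties": {"min_total": {"type": "number"}},
            },
        },
        "required": ["region", "period"],
    },
}

_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
}


def validate(value, schema, path="$"):
    """Return a list of error strings; an empty list means the value conforms.

    Supports only a subset of JSON Schema: type, enum, anyOf,
    properties, and required.
    """
    errors = []
    expected = schema.get("type")
    if expected is not None:
        # bool is a subclass of int in Python, so reject it for numeric types.
        is_bool_as_number = isinstance(value, bool) and expected in ("integer", "number")
        if not isinstance(value, _TYPES[expected]) or is_bool_as_number:
            return [f"{path}: expected {expected}"]
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{path}: {value!r} is not one of {schema['enum']}")
    if "anyOf" in schema:
        # The value must conform to at least one branch.
        if not any(not validate(value, branch, path) for branch in schema["anyOf"]):
            errors.append(f"{path}: matches none of the anyOf branches")
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}: missing required field {key!r}")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(validate(value[key], sub_schema, f"{path}.{key}"))
    return errors


# Arguments as a model might emit them (made-up example strings).
good_args = json.loads('{"region": "emea", "period": 2024, "filters": {"min_total": 10.5}}')
bad_args = json.loads('{"region": "mars", "period": "q1"}')

good_errors = validate(good_args, GET_SALES_TOOL["parameters"])
bad_errors = validate(bad_args, GET_SALES_TOOL["parameters"])
```

Checking the arguments before making the real API call is one way to catch outputs that drift from the schema; it does not make the model itself any more deterministic.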

The key points are

  1. Being able to use some kind of tool calling or support for a tool calling syntax out of the box
  2. Adhering to a schema for creating REST API calls, as defined by an OpenAPI or JSON Schema specification
  3. Ability to express intent to call multiple tools at a time
  4. Ability to natively force certain tool calls
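As a rough sketch of points 3 and 4, the snippet below parses a model output that contains several tool calls and runs each one, with an optional check that a particular tool was called. The output format (a JSON array of objects with `name` and `arguments`), the two example tools, and `run_tool_calls` are assumptions for illustration, not any model's actual syntax. The check only verifies a call after generation; truly forcing one would need support in the model or the serving layer.

```python
import json


# Hypothetical tools, stubbed so the example runs on its own.
def get_sales(region):
    return {"region": region, "total": 100}


def get_weather(city):
    return {"city": city, "forecast": "sunny"}


TOOLS = {"get_sales": get_sales, "get_weather": get_weather}


def run_tool_calls(model_output, tools, required_tool=None):
    """Parse a JSON list of tool calls and execute each one in order.

    Assumes the model emits a JSON array of objects shaped like
    {"name": ..., "arguments": {...}}; a single object is also accepted.
    """
    calls = json.loads(model_output)
    if isinstance(calls, dict):
        calls = [calls]

    names = [call["name"] for call in calls]
    if required_tool is not None and required_tool not in names:
        # Post-hoc check only: the model was expected to call this tool.
        raise ValueError(f"model did not call required tool {required_tool!r}")

    results = []
    for call in calls:
        fn = tools.get(call["name"])
        if fn is None:
            results.append({"name": call["name"], "error": "unknown tool"})
            continue
        results.append({"name": call["name"], "result": fn(**call.get("arguments", {}))})
    return results


# Made-up model output expressing intent to call two tools at once.
output = (
    '[{"name": "get_sales", "arguments": {"region": "emea"}},'
    ' {"name": "get_weather", "arguments": {"city": "Paris"}}]'
)
results = run_tool_calls(output, TOOLS, required_tool="get_sales")
```

The results list keeps the order of the calls, so each result can be fed back to the model alongside the call that produced it.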

As a native feature, I believe long context is much more important because it opens up many capabilities, but function calling would also be a great nice-to-have.

avianion avatar May 07 '24 00:05 avianion

@HamidShojanazeri https://docs.mistral.ai/capabilities/function_calling/

Something like this would be real nice.

avianion avatar May 07 '24 15:05 avianion

@avianion Thanks for the feedback above!

The 3.1 models support function calling out of the box. You can find more details on the MODEL CARD page. Please re-open the issue in case you have more questions.

init27 avatar Aug 18 '24 02:08 init27