NanoLLM

Meta-Llama-3-8B-Instruct always replies: You are a helpful and friendly AI assistant.<|eot_id|>

UserName-wang opened this issue 1 year ago • 6 comments

Dear @dusty-nv ,

I'm trying the example code on web page: Function Calling.

I tried both Llama-2-7b-chat-hf and Meta-Llama-3-8B-Instruct, and Llama-2-7b-chat-hf seems more reliable than Meta-Llama-3-8B-Instruct. Here are the replies when I'm using Meta-Llama-3-8B-Instruct:

hello You are a helpful and friendly AI assistant.<|eot_id|>

what is time current time? You are a helpful and friendly AI assistant<|eot_id|>

what's the current date? You are a helpful and friendly AI assistant<|eot_id|>

But even Llama-2-7b-chat-hf cannot invoke the given bot_function every time. When I'm using Llama-2-7b-chat-hf, I have to specify the function name in the prompt to increase the likelihood of it invoking the bot_function.

Also, how can I pass arguments to the bot_function in a more reliable way? Thank you! For reference, the pattern I'm following from that page looks roughly like the sketch below.
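(This is paraphrased from memory, so exact names like BotFunctions.generate_docs() may not match the docs precisely; TIME is just one of the illustrative example functions.)

    from datetime import datetime
    from nano_llm import NanoLLM, ChatHistory, BotFunctions, bot_function

    @bot_function
    def TIME():
        """ Returns the current time. """
        return datetime.now().strftime('%-I:%M %p')

    model = NanoLLM.from_pretrained('meta-llama/Meta-Llama-3-8B-Instruct', api='mlc')

    # the example appends the generated function docs to the system prompt
    # so the model knows what it is allowed to call
    chat_history = ChatHistory(
        model,
        system_prompt='You are a helpful and friendly AI assistant. ' + BotFunctions.generate_docs(),
    )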

UserName-wang avatar Jun 08 '24 01:06 UserName-wang

Hi @UserName-wang , I have found the https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B model to be way better at function calling (it was finetuned for it), and showed it controlling HomeAssistant in the last research group meeting: https://www.jetson-ai-lab.com/research.html#past-meetings

I still need to update the documentation for this though, and I also need to make more modifications to add bot functions at runtime, etc. These tuned models are able to pass the function arguments reliably.
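For illustration (this is a made-up reply rather than captured model output, and get_time is a hypothetical tool), Hermes-2-Pro emits its calls as JSON wrapped in <tool_call> tags, which is what makes the arguments easy to parse:

    import json

    # hypothetical reply in Hermes-2-Pro's tool-calling format
    reply = '<tool_call>\n{"name": "get_time", "arguments": {"timezone": "UTC"}}\n</tool_call>'

    # pull the JSON out from between the tags and parse it
    payload = reply.split('<tool_call>')[1].split('</tool_call>')[0]
    call = json.loads(payload)
    print(call['name'], call['arguments'])  # get_time {'timezone': 'UTC'}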

dusty-nv avatar Jun 08 '24 02:06 dusty-nv

Hi! Thank you for your quick reply! Is there any model on NVIDIA NGC we can use via API key?


UserName-wang avatar Jun 08 '24 03:06 UserName-wang

There isn't one yet, but I hope/think we'll see more of these function-tuned models coming out. Let me know if you see or find any more in the smaller range of LLMs!

dusty-nv avatar Jun 08 '24 03:06 dusty-nv

Hi! I found that LangChain's OllamaFunctions can call tools and pass arguments reliably, but the response speed is very slow: the invoke call takes nearly 5 seconds! The pattern I tried looks roughly like the sketch below.
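(Sketched from memory off the LangChain docs, using their stock get_current_weather tool schema, so details may differ from what I actually ran.)

    from langchain_experimental.llms.ollama_functions import OllamaFunctions

    # bind a single tool schema (the weather tool is just the stock docs example)
    model = OllamaFunctions(model='llama3', format='json')
    model = model.bind_tools(
        tools=[{
            'name': 'get_current_weather',
            'description': 'Get the current weather in a given location',
            'parameters': {
                'type': 'object',
                'properties': {
                    'location': {'type': 'string', 'description': 'City name'},
                },
                'required': ['location'],
            },
        }],
    )

    # returns the tool name and arguments reliably, but takes ~5 seconds per call here
    print(model.invoke('What is the weather in Boston?'))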


UserName-wang avatar Jun 08 '24 04:06 UserName-wang

@dusty-nv what settings did you use in the Llamaspeak function calling video? I'm having trouble achieving similar performance using and modifying the example script. I tried the NousHermes model, being careful to use (and experiment with) their function-calling system prompt and chat-ml-tools template. It was pretty good at producing bare function calls, but failed in conversations of any complexity (hallucinating functions, failing to call functions, etc.). Also, functions are not executed: I see that functions are passed to model.generate(), but when I pull the thread through the code, I'm not sure I see where they are being executed. Is it in mlc._generate() around line 460?

function_next = function(stream.text)

Is the execution up to me, or is it automated? Just asking before I start reinventing a little wheel.

bigrobinson avatar Aug 20 '24 18:08 bigrobinson

Hi @bigrobinson, I need to update the docs for NousHermes, which uses multi-turn templates for the tool calls and responses. If you are using plugins, you can use that by connecting a plugin that exposes tools to the last output channel of the NanoLLM plugin:

https://github.com/dusty-nv/NanoLLM/blob/3a28b7ce78f95519a8020502817883862ca0c360/nano_llm/plugins/llm/nano_llm.py#L31

You can also mimic how that plugin calls chat_history.run_tools() to make your own generation script instead.
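In rough pseudocode, the flow there looks like this (argument names here are my shorthand, not the literal code; check the linked nano_llm.py for the actual signatures, including run_tools()):

    # rough sketch of the plugin's chat flow, not the literal code
    chat_history.append(role='user', msg=prompt)
    embedding, position = chat_history.embed_chat()

    reply = model.generate(
        embedding,
        kv_cache=chat_history.kv_cache,
        stop_tokens=chat_history.generate_stop_tokens(),
    )

    # store the bot reply, then execute any tool calls it contains and append
    # their outputs as new turns so the next generation pass can use the results
    chat_history.append(role='bot', text=reply)
    chat_history.run_tools(reply, tools=tools)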

dusty-nv avatar Aug 21 '24 05:08 dusty-nv