deepgram-go-sdk
Agent API: Function calling
Based on the latest (v1.8.0) godoc, it looks like function calling is not implemented.
type FunctionCallRequestResponse struct {
	Type           string            `json:"type,omitempty"`
	FunctionName   string            `json:"function_name,omitempty"`
	FunctionCallID string            `json:"function_call_id,omitempty"`
	Input          map[string]string `json:"input,omitempty"` // TODO: this is still undefined
}
ref: https://pkg.go.dev/github.com/deepgram/deepgram-go-sdk@v1.8.0/pkg/api/agent/v1/websocket/interfaces#FunctionCallRequestResponse
A map[string]string cannot represent inputs whose values are not plain strings, such as the array in this example from your docs:
{
  "type": "FunctionCallRequest",
  "function_name": "do_math",
  "function_call_id": "7433439b-c4b6-4369-8ce8-4124f6a98a1d",
  "input": { "numbers": ["2", "2"], "operation": "add" }
}
https://developers.deepgram.com/docs/voice-agents-function-calling#server-messages-sent-by-deepgram
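To make the mismatch concrete, here is a minimal sketch using only the standard library. The struct is copied from the godoc quoted above; the `sample` constant and the `decodeSample` helper are names made up for this illustration and are not part of the SDK. Decoding the documented payload into the `map[string]string` field should fail because `numbers` is an array.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// FunctionCallRequestResponse mirrors the v1.8.0 struct quoted above.
type FunctionCallRequestResponse struct {
	Type           string            `json:"type,omitempty"`
	FunctionName   string            `json:"function_name,omitempty"`
	FunctionCallID string            `json:"function_call_id,omitempty"`
	Input          map[string]string `json:"input,omitempty"`
}

// sample is the FunctionCallRequest message from the Deepgram docs.
const sample = `{
  "type": "FunctionCallRequest",
  "function_name": "do_math",
  "function_call_id": "7433439b-c4b6-4369-8ce8-4124f6a98a1d",
  "input": { "numbers": ["2", "2"], "operation": "add" }
}`

// decodeSample (hypothetical helper) tries to decode the documented
// payload into the struct and returns whatever error encoding/json reports.
func decodeSample() error {
	var req FunctionCallRequestResponse
	return json.Unmarshal([]byte(sample), &req)
}

func main() {
	if err := decodeSample(); err != nil {
		// The array under "numbers" cannot be stored in a string value.
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println("decoded without error")
}
```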
My use case requires function calling. Is there a way for me to work around this limitation? I'm using agent.NewWSUsingChanWithCancel, which seems to be the main entry point. It looks like I could use common.NewWS instead if I implement a few interfaces.
Or, if you're open to a PR, I'd be happy to submit one. I would suggest using json.RawMessage for Input and Output, so the user can handle marshaling however they need. That would also be more efficient: I usually just pass these values along without inspecting them, so an unnecessary marshaling round trip through map[string]any would be undesirable.
@robfig Yeah, we are definitely open to PRs. If you want to make a suggestion here, we can check it out as soon as it's up. You can tag me as a reviewer.
Here it is: https://github.com/deepgram/deepgram-go-sdk/pull/283
@jpvajda bump
Function Calling is now supported in this SDK.