databricks-sdk-go
[FEATURE] Support Streaming Responses From ServingEndpoints.Query()
Problem Statement
The serving.QueryEndpointInput type has a Stream boolean field, but setting it to true causes ServingEndpoints.Query() to fail with the following error:
panic: unexpected error handling request: invalid character 'd' looking for beginning of value. This is likely a bug in the Databricks SDK for Go or the underlying REST API. Please report this issue with the following debugging information to the SDK issue tracker at https://github.com/databricks/databricks-sdk-go/issues.
Proposed Solution
Ideally, the Go SDK would handle streaming responses from Model Serving endpoints.
Additional Context
Here is example code to recreate the issue:
package main

import (
	"context"

	"github.com/databricks/databricks-sdk-go"
	"github.com/databricks/databricks-sdk-go/service/serving"
)

func main() {
	w := databricks.Must(databricks.NewWorkspaceClient())
	name := "databricks-meta-llama-3-70b-instruct"
	stopSequences := []string{"DONE"}
	messages := []serving.ChatMessage{
		{Content: "Where should I go for a quick local vacation that is within 100 miles of my home?", Role: serving.ChatMessageRoleUser},
		{Content: "A quick local vacation sounds like just what you need! I'd be happy to help you find a great spot within 100 miles of your home.\n\nTo give you the best recommendations, could you please share your location or the city you're closest to?", Role: serving.ChatMessageRoleAssistant},
		{Content: "Yes, I'd be happy to share my location. I live in Seattle and prefer to do outdoor activities.", Role: serving.ChatMessageRoleUser},
	}
	// Setting Stream to true is what triggers the error shown above.
	response, err := w.ServingEndpoints.Query(context.Background(), serving.QueryEndpointInput{
		Name:     name,
		Messages: messages,
		Stream:   true,
		Stop:     stopSequences,
	})
	if err != nil {
		panic(err)
	}
	println(response)
}