
More “Stream” Options

Phoseele opened this issue 10 months ago • 1 comment

Describe the feature

When I request a response with the stream option enabled, it seems that a string concatenation is performed for every generated character, so a long response causes a lot of GC allocations and a serious performance problem. Would it be possible to add an option that returns only the newest generated characters instead of the entire response built by string concatenation? The developer could then decide how to use them. Alternatively, an overload of "LLMCharacter.Chat" could change the callback parameter type from "Callback string" to "Callback StringBuilder" and take a caller-supplied StringBuilder, which would avoid the string-concatenation GC.
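To make the request concrete, here is a minimal sketch of the two callback styles being contrasted. It is written in Java rather than the plugin's C# (StringBuilder behaves the same way in both), and the function names and token stream are hypothetical, not part of the LLMUnity API:

```java
import java.util.function.Consumer;

public class StreamDemo {
    // Hypothetical token stream standing in for model output
    static final String[] TOKENS = {"Hel", "lo", ", ", "wor", "ld", "!"};

    // Current behavior as described in the issue: the callback receives the
    // full response so far, rebuilt by concatenation on every token, which
    // allocates a new String (garbage) each iteration.
    static void streamFullString(Consumer<String> onUpdate) {
        String response = "";
        for (String token : TOKENS) {
            response += token;         // new String allocation per token
            onUpdate.accept(response); // callback gets the whole response
        }
    }

    // Proposed alternative: the callback receives only the newest token and
    // the text accumulates in a caller-owned StringBuilder, so no
    // intermediate full-response Strings are allocated.
    static void streamDelta(StringBuilder sink, Consumer<String> onToken) {
        for (String token : TOKENS) {
            sink.append(token);        // amortized O(1) append, no garbage
            onToken.accept(token);     // callback gets only the new token
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        streamDelta(sb, t -> {});
        System.out.println(sb);        // prints the accumulated response
    }
}
```

With the concat style, streaming n tokens copies O(n²) characters in total across the intermediate strings; with the delta style the total work is O(n) and the only long-lived allocation is the StringBuilder the caller already owns.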

Phoseele avatar Feb 20 '25 16:02 Phoseele

Yes, this can be done; it needs a bit of engineering on the LlamaLib side.

amakropoulos avatar Feb 20 '25 21:02 amakropoulos