I don't receive the answer as a stream when using Ollama.
I want to build a chatbot that answers in real time, streaming its responses like ChatGPT, but I'm having trouble getting the response stream to work. I'm using embeddings, and my only issue is the streaming of the responses. When I try it on the console, it prints the complete answer at once instead of streaming it.
Console:
Enter your prompt: hi
Hello! It's nice to meet you. Is there something I can help you with or would you like to chat?
Code:
<?php
require 'vendor/autoload.php';
use LLPhant\OllamaConfig;
use LLPhant\Chat\OllamaChat;
$config = new OllamaConfig();
$config->model = 'llama2';
$config->stream = true;
$chat = new OllamaChat($config);
$prompt = readline("Enter your prompt: ");
// Expected to print chunks as they arrive, but the full answer is printed at once
$responseStream = $chat->generateStreamOfText($prompt);
foreach ($responseStream as $response) {
    echo $response . PHP_EOL;
}
Hey @santiOcampo01, did you manage to make it stream? Did you try the QuestionAnswering class?
Since the package waits for the complete response from the model, for now there is no way to stream the response (like ollama does on the CLI).
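Until the package supports it, one workaround is to call Ollama's HTTP API directly with Guzzle in streaming mode. This is only a minimal sketch, not LLPhant code; the base URL, the /api/generate payload, and the "response" field in the newline-delimited JSON output are assumptions about the Ollama API:

<?php
require 'vendor/autoload.php';

use GuzzleHttp\Client;

$client = new Client(['base_uri' => 'http://localhost:11434']);

// Ask Guzzle not to buffer the whole body, so it can be read while Ollama is still generating
$response = $client->post('/api/generate', [
    'json'   => ['model' => 'llama2', 'prompt' => 'hi', 'stream' => true],
    'stream' => true,
]);

$body   = $response->getBody();
$buffer = '';

while (!$body->eof()) {
    $buffer .= $body->read(64);

    // Assumption: Ollama sends one JSON object per line; print each "response" fragment immediately
    while (($pos = strpos($buffer, "\n")) !== false) {
        $line   = substr($buffer, 0, $pos);
        $buffer = substr($buffer, $pos + 1);
        $data   = json_decode($line, true);
        if (isset($data['response'])) {
            echo $data['response'];
            flush();
        }
    }
}
echo PHP_EOL;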
@MaximeThoonsen any thoughts on replacing the Ollama client code with an external library? There are a few Ollama PHP clients I could find that support streamed responses.
You may want to give https://github.com/LLPhant/LLPhant/pull/298 a try; with these changes I'm able to stream the content while it is being sent:
try {
    $stream = $chat->generateStreamOfText(implode($mappedTextPrompts));
    // $stream = $chat->generateChatStream($mappedPrompts);
} catch (\Throwable $e) {
    yield 'AI error: ' . $e->getMessage();
    return;
}

// Read the returned stream in small chunks and yield each one as it arrives
while (!$stream->eof()) {
    $t = $stream->read(32);
    yield $t;
}