cody
Chat: refactor Ollama chat client
This PR fixes an issue where `PromptString` objects were used directly as prompt text in the messages we send to the LLM, causing a regression in chat responses from some Ollama models.
Also updated the Ollama and Groq chat clients with better error handling:

- Log usage on completion for easier debugging
Ollama:
Groq:
Test plan
- Run this branch in debug mode and follow our Ollama docs to set up Ollama with Cody
- Try asking Cody a question
Before
After
@dominiccooney I've applied your feedback to the PR:
- [x] Use a TextDecoderStream
- [x] Accumulate the decoded text into a string
- [x] Break on newlines
- [x] JSON.parse each chunk and act on it
- [x] Don't assume that, having seen the end of the network response, you have decoded all the characters in the response yet; the last packet may still contain data.
- [x] Process `if (value)` first, so `value` is decoded before checking `if (done)`
@dominiccooney @valerybugakov I've updated the PR to use the official Ollama JS package for chat instead of implementing our own client. May I get your review again, please?
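For context, a minimal sketch of what delegating streaming chat to the official `ollama` package can look like, with the client injected behind a narrow interface so it can be stubbed in tests. The `OllamaLike` interface, `streamChat` helper, and log messages here are hypothetical approximations, not the actual PR code:

```typescript
// Hypothetical message/part shapes approximating the `ollama` package's
// streaming chat API (assumption, not taken from this PR).
interface ChatMessage {
    role: 'user' | 'assistant' | 'system'
    content: string
}
interface ChatPart {
    message: { content: string }
}
interface OllamaLike {
    chat(req: {
        model: string
        messages: ChatMessage[]
        stream: true
    }): Promise<AsyncIterable<ChatPart>>
}

// Stream a chat completion, forwarding each token to `onToken` and
// returning the full accumulated response.
async function streamChat(
    client: OllamaLike,
    model: string,
    messages: ChatMessage[],
    onToken: (text: string) => void
): Promise<string> {
    let full = ''
    try {
        const stream = await client.chat({ model, messages, stream: true })
        for await (const part of stream) {
            onToken(part.message.content)
            full += part.message.content
        }
    } catch (error) {
        // Better error handling: surface the failure instead of
        // silently dropping it.
        console.error('Ollama chat failed:', error)
        throw error
    }
    // Log usage on completion for easier debugging.
    console.log(`Ollama chat complete: ${full.length} characters`)
    return full
}
```

Injecting the client rather than importing it directly keeps the streaming and logging logic unit-testable without a running Ollama server.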
Verified Ollama still works with the latest commit: