cody
Chat: refactor Ollama chat client
This PR fixes an issue where `PromptString` objects were used directly as prompt text in the messages we send to the LLM, causing a regression in chat responses from some Ollama models.
Also updated the Ollama and Groq chat clients with better error handling:

- Log usage on completion for easier debugging
Ollama:
Groq:
Test plan
- Run this branch in debug mode and follow our Ollama docs to set up Ollama with Cody
- Try asking Cody a question
Before
After
@dominiccooney I've applied your feedback to the PR:
- [x] Use a TextDecoderStream
- [x] Accumulate the decoded text into a string
- [x] Break on newlines
- [x] JSON.parse each chunk and act on it
- [x] Don't assume that, having seen the end of the network response, you have decoded all the characters in the response yet; the last packet may still contain data.
- [x] Process `if (value)` first, so `value` is decoded before checking `if (done)`
@dominiccooney @valerybugakov I've updated the PR to use the official Ollama JS package for chat instead of implementing our own client. May I get your review again, please?
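For context, a minimal sketch of what delegating streaming chat to the official `ollama` package can look like, with the client injected behind a narrow interface so it can be stubbed in tests. The `OllamaLike` interface, `streamChat` helper, and log messages here are hypothetical approximations, not the actual PR code:

```typescript
// Hypothetical message/part shapes approximating the `ollama` package's
// streaming chat API (assumption, not taken from this PR).
interface ChatMessage {
    role: 'user' | 'assistant' | 'system'
    content: string
}
interface ChatPart {
    message: { content: string }
}
interface OllamaLike {
    chat(req: {
        model: string
        messages: ChatMessage[]
        stream: true
    }): Promise<AsyncIterable<ChatPart>>
}

// Stream a chat completion, forwarding each token to `onToken` and
// returning the full accumulated response.
async function streamChat(
    client: OllamaLike,
    model: string,
    messages: ChatMessage[],
    onToken: (text: string) => void
): Promise<string> {
    let full = ''
    try {
        const stream = await client.chat({ model, messages, stream: true })
        for await (const part of stream) {
            onToken(part.message.content)
            full += part.message.content
        }
    } catch (error) {
        // Better error handling: surface the failure instead of
        // silently dropping it.
        console.error('Ollama chat failed:', error)
        throw error
    }
    // Log usage on completion for easier debugging.
    console.log(`Ollama chat complete: ${full.length} characters`)
    return full
}
```

Injecting the client rather than importing it directly keeps the streaming and logging logic unit-testable without a running Ollama server.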
Verified Ollama still works with the latest commit: