feat: Multimodal block display + Steamship agent

Open eob opened this issue 1 year ago • 5 comments

This PR adds two separate (but related) features:

  1. Basic support in the UI for multimodal blocks (think Notion-style) as the agent response type. String completions, as provided before, are rendered as a text block.
  2. A Steamship companion of Rick (from Rick & Morty), which supports audio and image responses in addition to text.
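To illustrate the first point, here is a minimal sketch of how a plain string completion could be normalized into a single text block. The `Block` type, its field names, and the `toBlocks` helper are hypothetical and only meant to show the idea; the actual types in this branch may differ.

```typescript
// Hypothetical block shape; the real types in this PR may differ.
type Block =
  | { kind: "text"; text: string }
  | { kind: "image"; url: string }
  | { kind: "audio"; url: string };

// Normalize an agent response: a plain string completion becomes one text
// block, while a response that is already a list of blocks passes through.
function toBlocks(response: string | Block[]): Block[] {
  if (typeof response === "string") {
    return [{ kind: "text", text: response }];
  }
  return response;
}
```

With something like this, the UI only ever has to render a list of blocks, and the existing text-only models keep working without changes.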

Note: We've wired this agent to a hosted copy of Rick w/ custom voice & character knowledge, for which you'll need an API key to access -- we can DM that key for your use.

image

eob avatar Jul 12 '23 13:07 eob

Thanks so much for this @eob! I tried it out on Codespaces but for some reason the audio / image generation didn't work. I can check out the branch locally to see why that's the case.

Would be super helpful if you could add more instructions for creating a user's own agent on Steamship / have a ready-to-use agent people can deploy easily when they first launch this app too!

ykhli avatar Jul 12 '23 23:07 ykhli

Hi @ykhli -- thanks for taking a look!

This PR is tied to an agent already deployed & instantiated, but we're about to send another PR to the python branch on this project which contains exactly what you're asking for above. (@EniasCailliau has it pretty close). That PR will work with the backstory bootstrapping code as well, which is an extra nice bit.

Maybe the best thing to do is wait for that PR and then I'll update this PR to match? That way you'd be 100% in control of the agent end-to-end, using code only in this repo, rather than relying on a "magic URL" that I gave you.

eob avatar Jul 13 '23 09:07 eob

@ykhli I've added a PR that adds more instructions for creating a user's own agent on Steamship. Users will be able to either use the pre-deployed instance featured in this PR or can instantiate their own companions with the instructions in the readme.

Here's the PR #57

EniasCailliau avatar Jul 14 '23 12:07 EniasCailliau

@ykhli audio/image generation is now fixed. Sorry for the confusion!

image

EniasCailliau avatar Jul 14 '23 12:07 EniasCailliau

Hi @ykhli,

Totally our fault re: the images and audio not showing up! We pushed a typo up to the agent we had wired up to it.

  • That bug is now fixed
  • I swapped out the md5 library used in this branch for a TypeScript-native one.

Here's a test I just did:

image

This PR is basically your advice number 2, on #53 --

It implements support for a fixed Steamship endpoint -- just like the openai or vicuna13b ones -- that's an agent already running elsewhere.

How to run

  1. Pull down the update to this branch
  2. Make sure you have the STEAMSHIP_API_KEY variable in .env.local set to the one that I gave you on a separate channel.
       # Steamship related environment variables
       STEAMSHIP_API_KEY=
  3. Try saying hi to Rick, then ask him to send a picture of something.
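If the key is missing, the failure can be hard to spot, so a small guard may help. This is only a sketch: `requireSteamshipKey` is a hypothetical helper, not something in this branch, and it takes the environment as a parameter so it doesn't depend on a particular runtime.

```typescript
// Hypothetical helper: returns the Steamship API key or throws a clear error.
// `env` is passed in explicitly so the function has no runtime dependency.
function requireSteamshipKey(env: Record<string, string | undefined>): string {
  const key = env["STEAMSHIP_API_KEY"]?.trim();
  if (!key) {
    throw new Error("STEAMSHIP_API_KEY is not set; add it to .env.local");
  }
  return key;
}
```

In server-side code you would call it as `requireSteamshipKey(process.env)`, since Next.js loads `.env.local` into `process.env`.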

Latency

The latency is a bit long with pictures. We did some profiling and it's due to GPT4 + Eleven Labs + Stable Diffusion just adding up to a lot serially. We're working on streaming support and can merge it in when it's done.

Let me know if this works for you! Once we have this in, the other two PRs that Enias sent can be used to deploy agents that can be interacted with in this way.

eob avatar Jul 14 '23 13:07 eob

Thanks for the feedback! Merged in your suggestions and pulled the telegram link on all of them for now. We can send another PR that adds it back in, conditionally on the field being present, later if there's interest.
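For reference, the conditional version could look roughly like the sketch below. The `Companion` shape and the `telegramLink` field name are assumptions made for illustration, not the actual schema in this repo.

```typescript
// Hypothetical companion record; `telegramLink` is an assumed field name.
type Companion = { name: string; telegramLink?: string };

// Return the link text to show, or null when the companion has no Telegram link.
function telegramLabel(companion: Companion): string | null {
  const link = companion.telegramLink?.trim();
  return link ? `Chat with ${companion.name} on Telegram: ${link}` : null;
}
```

The UI would then render the link only when the result is non-null, so companions without the field are unaffected.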

eob avatar Jul 24 '23 17:07 eob

@jenniferli23 one last update -- Enias did some prompt engineering to get the image quality much better, and added in the telegram link for just the Rick bot.

I think this is ready to merge, but lmk if you need any more changes!

image

eob avatar Jul 26 '23 13:07 eob