AWS Bedrock support

Open csheaff opened this issue 1 year ago • 21 comments

Hello, it would be grand to be able to use AWS models from Amazon Bedrock, such as Anthropic Claude Sonnet 3.5.

csheaff avatar Sep 11 '24 00:09 csheaff

Can you provide a link to their API documentation?

karthink avatar Sep 11 '24 00:09 karthink

...looking for a way to use just HTTP requests, but I'm not sure it's possible.

csheaff avatar Sep 11 '24 02:09 csheaff

I'm not familiar with AWS Bedrock. How do you access models (or other computation) running there?

karthink avatar Sep 11 '24 02:09 karthink

The easiest approaches are to use the AWS CLI on the command line or a Python SDK. But I'm guessing what would be most convenient here is being able to send HTTPS requests from Lisp.

relevant? https://github.com/pokepay/aws-sdk-lisp/issues/35

csheaff avatar Sep 11 '24 02:09 csheaff

This is as close as I can find to the payload structure:

https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html

On the command line one would do:

aws bedrock-runtime converse \
--model-id amazon.titan-text-express-v1 \
--messages '[{"role": "user", "content": [{"text": "Describe the purpose of a \"hello world\" program in one line."}]}]' \
--inference-config '{"maxTokens": 512, "temperature": 0.5, "topP": 0.9}'

csheaff avatar Sep 11 '24 02:09 csheaff
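
For readers who want to see the HTTP shape behind the CLI call above, here is a minimal Python sketch of the same Converse request. The endpoint pattern (`POST /model/<model-id>/converse` on `bedrock-runtime.<region>.amazonaws.com`) follows the linked API reference; the region `us-east-1` and the helper names `converse_url` and `converse_body` are assumptions made for illustration. The request would still need AWS SigV4 signing before it could be sent.

```python
import json
from urllib.parse import quote

REGION = "us-east-1"  # assumption: substitute your own region


def converse_url(model_id, region=REGION):
    """Build the Converse endpoint URL for a model (hypothetical helper)."""
    # Model IDs may contain characters such as ':' that must be escaped in a path.
    return (
        f"https://bedrock-runtime.{region}.amazonaws.com"
        f"/model/{quote(model_id, safe='')}/converse"
    )


def converse_body(prompt, max_tokens=512, temperature=0.5, top_p=0.9):
    """Build the JSON body matching --messages and --inference-config above."""
    return {
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {
            "maxTokens": max_tokens,
            "temperature": temperature,
            "topP": top_p,
        },
    }


if __name__ == "__main__":
    print(converse_url("amazon.titan-text-express-v1"))
    print(json.dumps(converse_body('Describe the purpose of a "hello world" program in one line.'), indent=2))
```

Note that the model ID moves out of the payload and into the URL path, which is one way this API differs from an OpenAI-style request.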

> This is as close as I can find to the payload structure:
>
> https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html

This makes it seem like you can make http requests? Sorry, I'm not understanding how this service is structured.

If you can make http requests I can add support for it to gptel.

karthink avatar Sep 11 '24 02:09 karthink

Yes, I think so. One could authenticate using environment variables:

AWS_SESSION_TOKEN, AWS_SECRET_ACCESS_KEY, AWS_ACCESS_KEY_ID, AWS_REGION

...or just have the user enter them during configuration. In my case I have to renew my credentials often for security reasons, so being able to update them without closing Emacs and re-exporting the variables in the terminal would be a plus.

csheaff avatar Sep 11 '24 16:09 csheaff
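
One way to meet the "renew without restarting" requirement is to read the variables at request time rather than once at startup. The Python sketch below illustrates that idea only; the function name `load_aws_credentials` is hypothetical, and treating `AWS_SESSION_TOKEN` as optional is an assumption (it is normally present only for temporary credentials).

```python
import os

REQUIRED = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION")
OPTIONAL = ("AWS_SESSION_TOKEN",)  # assumption: only set for temporary credentials


def load_aws_credentials(env=None):
    """Return the AWS settings found in `env` (defaults to os.environ).

    Called once per request, so freshly exported values are picked up
    without restarting the host program.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError("missing AWS settings: " + ", ".join(missing))
    creds = {name: env[name] for name in REQUIRED}
    for name in OPTIONAL:
        if env.get(name):
            creds[name] = env[name]
    return creds
```

Passing a plain dictionary instead of the real environment makes the function easy to exercise, for example `load_aws_credentials({"AWS_ACCESS_KEY_ID": "id", "AWS_SECRET_ACCESS_KEY": "secret", "AWS_REGION": "us-east-1"})`.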

Sorry, I don't follow how environment variables are relevant to making an HTTP request.

Is there a curl command you can run to receive model responses from AWS Bedrock?

karthink avatar Sep 11 '24 21:09 karthink

> Is there a curl command you can run to receive model responses from AWS Bedrock?

Curl has native support for the AWS signing method (see e.g. this article), so this should be possible. This reddit discussion seems to have a sample curl command to invoke Bedrock.

On a separate note, it would be nice to support CLI backends as well as REST HTTP backends, since then the work of actually calling a service can be offloaded to its native CLI tools (q chat, llama-cli, etc).

Note, since I'm employed by AWS: The above is purely my own knowledge and opinions, and not communication on behalf of my employer. This also applies to all future communications in this thread unless explicitly specified otherwise.

swapneils avatar Oct 15 '24 00:10 swapneils
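
To make the curl suggestion above concrete, here is a hedged Python sketch that assembles a curl argument list using curl's `--aws-sigv4` option. It is not the sample from the linked discussion. The signing service name `bedrock`, the `x-amz-security-token` header for temporary credentials, and the function name `sigv4_curl_args` are assumptions to verify against the AWS and curl documentation.

```python
def sigv4_curl_args(url, body_json, access_key, secret_key, region,
                    session_token=None, service="bedrock"):
    """Return a curl argv that asks curl to sign the request with AWS SigV4.

    `service="bedrock"` is an assumption about the signing name.
    """
    args = [
        "curl", "--silent", "--request", "POST",
        # Provider string format: aws:amz:<region>:<service>
        "--aws-sigv4", f"aws:amz:{region}:{service}",
        # curl takes the key pair through --user when signing.
        "--user", f"{access_key}:{secret_key}",
        "--header", "Content-Type: application/json",
        "--data", body_json,
    ]
    if session_token:
        # Temporary credentials carry a session token in a separate header.
        args += ["--header", f"x-amz-security-token: {session_token}"]
    args.append(url)
    return args
```

The list can be handed to `subprocess.run(args, capture_output=True)`; building it as a list rather than a shell string avoids quoting problems with the JSON body.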

@swapneils Thank you for the pointer -- this should be possible now.

@csheaff So this can be done using Curl, but someone will need to write an AWS bedrock backend for gptel. Unfortunately we can't inherit the OpenAI backend since the payload structure is different. PRs are welcome, you can copy gptel-openai.el or gptel-anthropic.el and modify it.

karthink avatar Oct 16 '24 21:10 karthink

Thanks @karthink. I'll try to find some time, but it might be tough.

csheaff avatar Oct 18 '24 23:10 csheaff

@karthink @csheaff Found this issue by accident. Maybe this will be useful for a future implementation: cl-bedrock. It includes support for the InvokeModel, Converse and ApplyGuardrail APIs.

JGalego avatar Oct 29 '24 13:10 JGalego

I'm working on this and almost have a first version ready. However, I don't fully understand the point of gptel--parse-buffer. Can you explain why that needs to be generic? Am I correct in understanding that it's only used in the chat buffer opened by M-x gptel?

felipeochoa avatar Feb 20 '25 22:02 felipeochoa

> I'm working on this and almost have a first version ready. However, I don't fully understand the point of gptel--parse-buffer. Can you explain why that needs to be generic?

@felipeochoa The input to gptel--parse-buffer is a buffer position to scan backwards from, up to the start of the (possibly narrowed) buffer. It collects user text and LLM responses into an array of messages and returns this array. The format of this array is API-specific, so each API uses a different implementation. The Bedrock API must have a specification too; you'll need to create an array of messages following this spec.

To see what the messages array looks like for the active backend, you can run (gptel--parse-buffer gptel-backend (point-max)) in a chat buffer.

> Am I correct in understanding that it's only used in the chat buffer opened by M-x gptel?

It's used in all situations except when gptel-request is given an explicit prompt (string or list of strings) to use instead. By default, gptel does not distinguish between chat and non-chat buffers.

karthink avatar Feb 20 '25 22:02 karthink
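
To illustrate what "an array of messages following this spec" could look like for Bedrock, here is a small Python sketch that turns a list of (role, text) turns into the message shape shown in the Converse CLI example earlier in the thread. This is not gptel's implementation, which is written in Emacs Lisp; the function name `to_converse_messages` is hypothetical, and merging consecutive same-role turns rests on the assumption that the API expects user and assistant roles to alternate.

```python
def to_converse_messages(turns):
    """Convert [(role, text), ...] into Converse-style messages.

    Each message is {"role": ..., "content": [{"text": ...}, ...]}.
    Consecutive turns with the same role are merged into one message
    (assumption: the API expects roles to alternate).
    """
    messages = []
    for role, text in turns:
        if role not in ("user", "assistant"):
            raise ValueError(f"unsupported role: {role!r}")
        text = text.strip()
        if not text:
            continue  # skip empty turns rather than sending blank content
        if messages and messages[-1]["role"] == role:
            messages[-1]["content"].append({"text": text})
        else:
            messages.append({"role": role, "content": [{"text": text}]})
    return messages
```

The contrast with an OpenAI-style array, where `content` is usually a plain string, is one reason a backend-specific parser is needed.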

Ah thanks for that explanation. That makes sense now. I didn't quite get to finish the stream implementation, and it's totally untested, but I think the main pieces are in place in f6b8f41. The one change to the internal API I had to make was to allow :curl-args to be a function so that the backend could inject the AWS credentials afresh each time. That's a small change in a7bd580.

If anyone understands how to handle the ConverseStream response (mime type application/vnd.amazon.eventstream it seems?), pointers would be helpful. ChatGPT insists that it's a plain JSON stream...

felipeochoa avatar Feb 21 '25 03:02 felipeochoa
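
On the event stream question above: as far as the AWS event stream encoding is documented, `application/vnd.amazon.eventstream` is a binary framing rather than a plain JSON stream. Each message starts with a 12-byte prelude (total length, headers length, CRC32 of those 8 bytes, all big-endian), followed by the headers, the payload, and a trailing CRC32 of everything before it. The Python sketch below decodes frames on that assumption and returns any incomplete tail so it can be retried when more bytes arrive. It is an illustration, not the code from the commits mentioned; header decoding is left out, and `encode_frame` exists only to make the example round-trippable.

```python
import struct
import zlib

PRELUDE_LEN = 12  # total length (4) + headers length (4) + prelude CRC (4)
CRC_LEN = 4       # trailing message CRC


def encode_frame(headers: bytes, payload: bytes) -> bytes:
    """Build one frame from already-encoded headers and a payload."""
    total = PRELUDE_LEN + len(headers) + len(payload) + CRC_LEN
    prelude = struct.pack(">II", total, len(headers))
    prelude += struct.pack(">I", zlib.crc32(prelude))
    body = prelude + headers + payload
    return body + struct.pack(">I", zlib.crc32(body))


def decode_frames(buf: bytes):
    """Return (frames, rest): complete (headers, payload) pairs and leftover bytes."""
    frames = []
    pos = 0
    while len(buf) - pos >= PRELUDE_LEN:
        total, headers_len, prelude_crc = struct.unpack_from(">III", buf, pos)
        if zlib.crc32(buf[pos:pos + 8]) != prelude_crc:
            raise ValueError("prelude CRC mismatch")
        if total < PRELUDE_LEN + CRC_LEN + headers_len:
            raise ValueError("inconsistent frame lengths")
        if len(buf) - pos < total:
            break  # frame not fully received yet; keep it for the next call
        end = pos + total
        (message_crc,) = struct.unpack_from(">I", buf, end - CRC_LEN)
        if zlib.crc32(buf[pos:end - CRC_LEN]) != message_crc:
            raise ValueError("message CRC mismatch")
        headers_start = pos + PRELUDE_LEN
        payload_start = headers_start + headers_len
        frames.append((buf[headers_start:payload_start],
                       buf[payload_start:end - CRC_LEN]))
        pos = end
    return frames, buf[pos:]
```

Because the frames contain arbitrary bytes, the stream has to be read without text decoding, and the returned `rest` plays the same role as a cursor marking how far parsing has progressed.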

Made a bit of progress today, 8e21474. Main thing missing is media handling and e2e testing!

~~@karthink if you have a chance to look at gptel-curl--parse-stream and how I'm using a marker to keep track of what's been parsed, I'd appreciate that. I think you do something similar using (point) instead of a separate marker. I opted for a new marker as a precaution against issues like #261. The main thing I'm unsure about is whether we can assume that every streaming response gets a fresh buffer, or whether buffers can be reused across requests.~~ (nvm, figured this out)

felipeochoa avatar Feb 22 '25 05:02 felipeochoa

Continuing to chip away at this in 65079f8. Good news is that basic messages and streaming work! I'm debugging tool use, and have half the media handling set up. I did have to add one more change to the internal API to allow running the curl process in 'binary coding. That's in 2378b96

felipeochoa avatar Feb 26 '25 03:02 felipeochoa

PR at #670

felipeochoa avatar Feb 28 '25 01:02 felipeochoa

> ~~@karthink if you have a chance to look at gptel-curl--parse-stream and how I'm using a marker to keep track of what's been parsed, I'd appreciate that. I think you do something similar using (point) instead of a separate marker. I opted for a new marker as a precaution against issues like #261. The main thing I'm unsure about is whether we can assume that every streaming response gets a fresh buffer, or whether buffers can be reused across requests.~~ (nvm, figured this out)

I'm looking at it now. I don't think you need gptel-bedrock--stream-cursor or for it to be a marker, but there's no harm I guess.

karthink avatar Mar 13 '25 05:03 karthink

> ~~@karthink if you have a chance to look at gptel-curl--parse-stream and how I'm using a marker to keep track of what's been parsed, I'd appreciate that. I think you do something similar using (point) instead of a separate marker. I opted for a new marker as a precaution against issues like #261. The main thing I'm unsure about is whether we can assume that every streaming response gets a fresh buffer, or whether buffers can be reused across requests.~~ (nvm, figured this out)
>
> I'm looking at it now. I don't think you need gptel-bedrock--stream-cursor or for it to be a marker, but there's no harm I guess.

You can also use (process-mark).

karthink avatar Mar 13 '25 05:03 karthink

Would be great to have this feature added to gptel. Also using this comment as an opportunity to thank everyone who is involved in development of gptel and especially @karthink . You guys are amazing <3

pavloo avatar Apr 13 '25 19:04 pavloo

Now that work on this feature is continuing again:

I'm happy to merge this into gptel, but I'm going to be unable to provide support since (i) I don't have access to a model hosted on AWS bedrock, and (ii) don't know how the API works. (The protocol is different enough from other backends that I can't just wing it.) So if anyone raises an issue, bug report or support request involving gptel-bedrock, it will be up to @felipeochoa, @akssri or one of the other users of this feature to help. The code will likely rot over time without your help maintaining it.

Alternatively, this package can be owned by one of you, and I can (i) provide instructions in the README for obtaining and using gptel-bedrock, and (ii) raise a PR on your repo if there is a breaking change in the underlying gptel code. I can also help you add the package to MELPA and NonGNU ELPA, but I can't help update it re: changes to or bugs in the Bedrock API handling.

In either case, the few changes required for gptel-bedrock to work in the rest of gptel can be added to gptel.

Please let me know your thoughts.

karthink avatar May 23 '25 22:05 karthink

I'd be happy to help maintain this (at least as long as I have access to AWS Bedrock at work); I'd strongly prefer keeping it here.

akssri avatar May 23 '25 22:05 akssri

> I'd be happy to help maintain this (at least as long as I have access to AWS Bedrock at work);

That's great! 👍

> I'd strongly prefer keeping it here.

I'm waiting to hear from the other contributors. Unless I get a job where I have to use LLMs via AWS Bedrock and can use them from Emacs, I'm going to be no help at all.

karthink avatar May 30 '25 06:05 karthink

Support for AWS Bedrock has been merged in #867 (even though GitHub reports that the PR was closed).

Many thanks to @felipeochoa and @akssri for adding the feature, and to everyone in this thread for helping test it.


However, there is a caveat:

As mentioned above, I won't be able to maintain gptel-bedrock.el, except for adjustments relating to internal API changes on the gptel side.

Currently @akssri has generously agreed to maintain it. If gptel-bedrock is unmaintained in the future, I might have to remove it from gptel.

karthink avatar May 31 '25 05:05 karthink

One note to others trying to get AWS bedrock to work: You may need curl 8.14+. Older versions of curl (8.7.1, for example) may produce HTTP 403 invalid signature errors. https://github.com/curl/curl/pull/17129 is included in curl 8.14.

aronatkins avatar Jun 09 '25 15:06 aronatkins
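
A quick way to check which side of that threshold a machine is on is to parse the first line of `curl --version`. The Python sketch below assumes that line begins with `curl <major>.<minor>.<patch>`; the default minimum of 8.14.0 is taken from the comment above and should be treated as adjustable, and the helper names are hypothetical.

```python
import re
import subprocess

MIN_CURL = (8, 14, 0)  # from the comment above; adjust if a lower bound proves sufficient


def parse_curl_version(version_output):
    """Extract (major, minor, patch) from the output of `curl --version`."""
    match = re.match(r"curl (\d+)\.(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        raise ValueError("unrecognized curl --version output")
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def is_new_enough(version, minimum=MIN_CURL):
    """Tuple comparison gives the usual version ordering."""
    return version >= minimum


def installed_curl_version():
    """Run the local curl binary and parse its reported version."""
    out = subprocess.run(["curl", "--version"], capture_output=True,
                         text=True, check=True).stdout
    return parse_curl_version(out)
```

For instance, `is_new_enough(parse_curl_version("curl 8.7.1 (x86_64-apple-darwin) libcurl/8.7.1"))` evaluates to `False` with the default minimum.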

Yeah, I need to increase the minimum version; a colleague of mine found that nothing below 8.9.1 works.

akssri avatar Jun 09 '25 15:06 akssri

Thanks @akssri for getting this over the line!

@karthink feel free to tag me whenever a bedrock bug comes through. Can't promise a speedy response, but I'll do my best to look into it

felipeochoa avatar Jun 16 '25 20:06 felipeochoa