increase embedding provider support
add support for embeddings from:
Embedding providers are added by implementing the required traits. For example, see the implementation for Cohere.
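As a rough sketch of what "implementing the required traits" looks like: the type and trait names below (`EmbeddingProvider`, `EmbeddingRequest`, etc.) are illustrative only, not the actual pg_vectorize API; see `core/src/transformers/providers/` for the real definitions.

```rust
// Hypothetical sketch -- names are illustrative, not the real pg_vectorize traits.
pub struct EmbeddingRequest {
    pub model: String,
    pub inputs: Vec<String>,
}

pub struct EmbeddingResponse {
    pub embeddings: Vec<Vec<f64>>,
}

// Each provider implements a common trait so the extension can call it generically.
pub trait EmbeddingProvider {
    fn generate_embedding(&self, req: &EmbeddingRequest) -> Result<EmbeddingResponse, String>;
}

pub struct MyProvider {
    pub base_url: String,
    pub api_key: Option<String>,
}

impl EmbeddingProvider for MyProvider {
    fn generate_embedding(&self, req: &EmbeddingRequest) -> Result<EmbeddingResponse, String> {
        // A real implementation would POST to `self.base_url` and parse the
        // provider's JSON response; dummy vectors keep this sketch self-contained.
        Ok(EmbeddingResponse {
            embeddings: req.inputs.iter().map(|_| vec![0.0; 3]).collect(),
        })
    }
}
```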
Not only embeddings, but other capabilities besides Ollama, like API calling, would also help a lot. For example, I cannot use it with Persian embedding models and some other models.
We do have support for Ollama (thanks to @destrex271), but we are pretty light on documentation for using it.
https://github.com/tembo-io/pg_vectorize/blob/main/core/src/transformers/providers/ollama.rs
can we change this
Can you change `pub const OLLAMA_BASE_URL: &str = "http://localhost:3001";` to an environment variable with an optional API key, so we can support other hosted Ollama-compatible web services?
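What this request amounts to, as a minimal sketch: read the URL from an environment variable with a fallback to the compiled-in default, and pick up an optional API key. The variable names `OLLAMA_BASE_URL` and `OLLAMA_API_KEY` are hypothetical here, and this is not necessarily the approach the project takes.

```rust
use std::env;

// Default from the current code; used when no override is provided.
const DEFAULT_OLLAMA_URL: &str = "http://localhost:3001";

// Pure helper: take an optional override and fall back to the default.
fn resolve_url(override_url: Option<String>) -> String {
    override_url.unwrap_or_else(|| DEFAULT_OLLAMA_URL.to_string())
}

// Resolve the endpoint from a hypothetical OLLAMA_BASE_URL env variable.
fn ollama_base_url() -> String {
    resolve_url(env::var("OLLAMA_BASE_URL").ok())
}

// Optional API key for hosted Ollama-compatible services.
fn ollama_api_key() -> Option<String> {
    env::var("OLLAMA_API_KEY").ok()
}
```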
Yes, it can be changed. It is a configuration in Postgres:
When you run this with docker-compose it gets set to the docker service name by default.
```
postgres=# show vectorize.ollama_service_url ;
 vectorize.ollama_service_url
----------------------------------
 http://ollama-serve:3001/v1/chat
(1 row)
```
But it can be set to whatever value you want, e.g. `ALTER SYSTEM SET vectorize.ollama_service_url TO 'https://www.myservice.ai/embeddings';`. The assumption here is that whatever service is running at vectorize.ollama_service_url has an API (request and response) schema exactly like Ollama's.
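Note that after `ALTER SYSTEM SET`, the new value typically takes effect only once the configuration is reloaded:

```sql
-- Point the extension at any Ollama-compatible endpoint, then reload config.
ALTER SYSTEM SET vectorize.ollama_service_url TO 'https://www.myservice.ai/embeddings';
SELECT pg_reload_conf();
```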
Similarly, you can change the OpenAI URL, so long as the server running there has the same API schema as OpenAI's:
```
postgres=# show vectorize.openai_service_url ;
 vectorize.openai_service_url
------------------------------
 https://api.openai.com/v1
(1 row)
```
All the Postgres settings that can be changed are defined here: https://github.com/tembo-io/pg_vectorize/blob/main/extension/src/guc.rs
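Since these are standard Postgres settings, you can also list the current values of all of the extension's GUCs from psql:

```sql
-- Show every vectorize.* setting and its current value.
SELECT name, setting
FROM pg_settings
WHERE name LIKE 'vectorize.%';
```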
Let me try it
Let me know if you run into any issues. You can ping me here, or in the Tembo community Slack.
💎 $150 bounty • Tembo
Steps to solve:
- Start working: Comment `/attempt #152` with your implementation plan
- Submit work: Create a pull request including `/claim #152` in the PR body to claim the bounty
- Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts
Thank you for contributing to tembo-io/pg_vectorize!
| Attempt | Started (GMT+0) | Solution |
|---|---|---|
| 🟢 @palash25 | Oct 27, 2024, 10:25:12 AM | #174 |
@palash25 please check #169 too. maybe your experience and attempt can help us with that issue too.
💡 @palash25 submitted a pull request that claims the bounty. You can visit your bounty board to reward.
Hi @ChuckHend, is the bounty per provider API, or for all four of the implementations? Asking so I know whether you expect smaller individual PRs for each API provider or one large PR containing all the implementations.
For now I have created a draft for the Voyage API, just for visibility, so that others know I am working on it and there is no duplication.
The bounty will be for a single provider. We will need to open new bounties for each new provider though. I think going forward we'll have bounties for specific providers by name.
And yes, as @tavallaie mentioned, there is some discussion over in https://github.com/tembo-io/pg_vectorize/issues/169 where we can hopefully make it so that you can add new embedding providers by inserting rows into a table.
any future work related to model provider support will fall under https://github.com/tembo-io/pg_vectorize/issues/169